R: read Squid files

read_squid {webreadr}

R Documentation

read Squid files

Description

The default Squid log formats are either the Common Log Format (CLF), which you can read with read_clf, or the "native" Squid format, which is described in more detail below. read_squid reads the latter.

Usage

read_squid(file, has_header = FALSE)

Arguments

`file`	the full path to the Squid-formatted (native log format) file you want to read.
`has_header`	whether or not the file has a header row. Set to FALSE by default.

Details

The log format for Squid servers can be custom-set, but by default follows one of two patterns: either the Common Log Format (CLF), which you can read in with read_clf, or the "native log format", a Squid-specific format handled by this function. The native format consists of the fields:

timestamp: the timestamp identifying when the request was received. This is stored (from the file's point of view) as a count of seconds, in UNIX time; read_squid converts these values into POSIXlt timestamps, assuming UTC as the origin timezone.
time_elapsed: the amount of time (in milliseconds) that the connection and fulfilment of the request lasted for.
ip_address: the IP address of the remote host making the request.
status_code: the status code and Squid response code associated with that request, stored as a single field. This can be split into two distinct fields with split_squid.
bytes_sent: the number of bytes sent.
http_method: the HTTP method (POST, GET, etc.) used.
url: the URL of the requested asset.
remote_user_ident: the RFC 1413 remote user identifier.
peer_info: the status of how forwarding to a peer server was handled and, if the request was forwarded, the server it was sent to.
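
To make the field layout concrete, here is a minimal base-R sketch that parses a single made-up log line by hand. It is not how read_squid is implemented; the log line, the hostnames and the variable names are all assumptions for illustration, and the line contains only the nine fields listed above.

```r
# A hypothetical line in Squid's native format (invented for this sketch),
# restricted to the nine fields described above.
line <- paste("1286536309.450 360 192.168.0.68 TCP_MISS/200 1019 GET",
              "http://example.com/index.html - DIRECT/203.0.113.7")

# Split on runs of spaces and label each piece with its field name.
fields <- strsplit(line, " +")[[1]]
names(fields) <- c("timestamp", "time_elapsed", "ip_address", "status_code",
                   "bytes_sent", "http_method", "url", "remote_user_ident",
                   "peer_info")

# The timestamp is a count of seconds in UNIX time; convert it to POSIXlt,
# treating it as UTC.
ts <- as.POSIXlt(as.numeric(fields[["timestamp"]]),
                 origin = "1970-01-01", tz = "UTC")

# status_code combines the Squid response code and the HTTP status,
# separated by a slash.
status_parts <- strsplit(fields[["status_code"]], "/", fixed = TRUE)[[1]]
squid_code  <- status_parts[1]
http_status <- as.integer(status_parts[2])

# peer_info follows the same "status/server" pattern.
peer_parts <- strsplit(fields[["peer_info"]], "/", fixed = TRUE)[[1]]
```

For real files, read_squid and split_squid do this work across every line; the sketch only shows what each field contains.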

Examples

#Read in an example Squid file provided with the webreadr package.
data <- read_squid(system.file("extdata/log.squid", package = "webreadr"))