R: Read Amazon S3 Access Logs

read_s3 {webreadr}

R Documentation

Read Amazon S3 Access Logs

read_s3 provides a reader for the access logs produced by Amazon's S3 service.

read_s3(file)

file

the full path to the S3 file you want to read.

S3 access logs contain information about requests to S3 buckets, and follow the standard server access log format documented by Amazon.

The fields for S3 files are:

owner: the owner of the S3 bucket; a hashed user ID.
bucket: the bucket that processed the request.
request_time: the time at which the request was received, formatted as a POSIXct timestamp.
remote_ip: the IP address that made the request.
requester: the user ID of the person making the request; Anonymous if the request was not authenticated.
operation: the actual operation performed with the request.
key: the request's key, normally an encoded URL fragment, or NA if the operation did not contain a key.
uri: the full URI for the request, together with the HTTP method and version. split_clf can split this into a data.frame of 3 columns.
status: the HTTP status code associated with the request.
error: the error code, if an error occurred; NA otherwise. See Amazon's S3 documentation for more information about S3 error codes.
sent: the number of bytes returned in response to the request.
size: the total size of the returned object.
time: the number of milliseconds between the request being received and the last byte of the response being sent, from the server's perspective.
turn_around: the number of milliseconds the S3 bucket spent processing the request.
referer: the referer associated with the request.
user_agent: the user agent associated with the request.
version_id: the version ID of the request; NA if the requested operation does not involve a version ID.
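
As a rough illustration of the uri field described above, the sketch below passes that column to split_clf and attaches the resulting columns to the original data. It assumes webreadr is installed and that split_clf accepts the character vector directly; the names requests and s3_split are only illustrative.

```r
library(webreadr)

# Read the example log that ships with the package
s3_data <- read_s3(system.file("extdata/s3.log", package = "webreadr"))

# Split the combined request field into its separate components
# (assumes split_clf takes the uri column as a character vector)
requests <- split_clf(s3_data$uri)
str(requests)

# Attach the split components alongside the original columns
s3_split <- cbind(s3_data, requests)
```

Keeping the split columns in a separate object first makes it easy to check them before combining them with the rest of the log.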

# Using the inbuilt testing dataset
s3_data <- read_s3(system.file("extdata/s3.log", package = "webreadr"))
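
Once the log is loaded, the columns listed above can be inspected with ordinary base R tools. The following is a minimal sketch, assuming webreadr is installed; the variable name failed is only illustrative, and no particular output is implied.

```r
library(webreadr)

s3_data <- read_s3(system.file("extdata/s3.log", package = "webreadr"))

# Overview of the columns and their types
str(s3_data)

# request_time is documented as POSIXct, so the time span can be read off directly
range(s3_data$request_time)

# Count requests by HTTP status code
table(s3_data$status)

# Keep only the rows where an error code was recorded (error is NA otherwise)
failed <- s3_data[!is.na(s3_data$error), ]
nrow(failed)
```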

[Package webreadr version 0.4.0 Index]