parse_logs {tabulog} | R Documentation |
Parse Log Files
Description
Parse a log file with a provided template and a set of classes
Usage
parse_logs(text, template, classes = list(), ...)
parse_logs_file(text_file, config_file, formatters = list(), ...)
Arguments
text |
Character vector; each element a log record |
template |
Template string |
classes |
A named list of parsers or regex strings for use within the template string |
... |
Other arguments passed onto |
text_file |
Filename (or readable connection) containing log text |
config_file |
Filename (or readable connection) containing the YAML template configuration |
formatters |
Named list of formatter functions used to format captured fields |
Details
template
should be only a template string, such as
'{{ip ip_address}} [{{date access_date}}]...'.
config_file
should be a YAML file or connection with the following fields:
template: Template String
classes: Named list of regex strings for building classes
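As a rough illustration of the two fields described above, the following sketch writes a small configuration file using base R only. The date regex is a simplified assumption, and `config_lines` and `config_path` are hypothetical names chosen for this example.

```r
# Hypothetical config contents: a template string plus one regex class.
# Single quotes keep YAML from treating '{' and '[' as flow syntax.
config_lines <- c(
  "template: '{{ip ipAddress}} - [{{date accessDate}}] {{int status}}'",
  "classes:",
  "  date: '[0-3][0-9]/[A-Z][a-z]{2}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4}'"
)

# Write the configuration to a temporary file
config_path <- tempfile(fileext = '.yaml')
writeLines(config_lines, config_path)

# Read it back to confirm what was written
config_written <- readLines(config_path)
```

The resulting path could then be supplied as config_file, for example parse_logs_file(text_file, config_path).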
text
should be a character vector, with each element representing
a log record
text_file
should be a file or connection that can be split (with readLines)
into a character vector of records
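A minimal sketch of what "split with readLines" means in practice, using base R only. The names `log_path`, `con`, and `records` are hypothetical and used only for this illustration.

```r
# Write two sample records to a temporary file, one record per line
log_path <- tempfile(fileext = '.log')
writeLines(c(
  '192.168.1.10 - [26/Jul/2019:11:41:10 -0500] 200',
  '192.168.1.11 - [26/Jul/2019:11:41:21 -0500] 404'
), log_path)

# Open a readable connection and split it into a character vector of records
con <- file(log_path, open = 'r')
records <- readLines(con)
close(con)

# Each element of records now holds one log record
n_records <- length(records)
```

Either the file name or an open readable connection of this kind matches the description of text_file above.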
classes
should be a named list of parser objects, where names
match names of classes in template string, or a similarly
named list of regex strings for coercing into parsers
formatters
should be a named list of functions, where names
match names of classes in template string, for properly
formatting fields once they have been captured
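The relationship between classes and formatters can be pictured with a small base R sketch. This is an illustration of the idea only and makes no claim about how tabulog implements it; `extract_field`, the regex strings, and the variable names are all hypothetical.

```r
# Illustrative only: base R, not tabulog's internals
record <- '192.168.1.10 - [26/Jul/2019:11:41:10 -0500] 200'

# Hypothetical class definitions: class name -> regex string
classes <- list(
  ip  = '[0-9]{1,3}(\\.[0-9]{1,3}){3}',
  int = '[0-9]+$'
)

# Hypothetical formatters: class name -> function applied to the captured text
formatters <- list(
  ip  = identity,
  int = as.integer
)

# Capture the first match for a class, then apply the formatter of the same name
extract_field <- function(x, class_name) {
  raw <- regmatches(x, regexpr(classes[[class_name]], x))
  formatters[[class_name]](raw)
}

ip_address <- extract_field(record, 'ip')
status <- extract_field(record, 'int')
```

Matching both lists by name is what lets a captured string such as a status code be returned as an integer rather than as text.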
Value
A data.frame with each field identified in the template string as a column.
For each record in the passed text, the fields are extracted and formatted
using the parser objects in default_classes() and classes.
Examples
# Template string with three fields
template <- '{{ip ipAddress}} - [{{date accessDate}}] {{int status }}'
# Two simple log records
logs <- c(
'192.168.1.10 - [26/Jul/2019:11:41:10 -0500] 200',
'192.168.1.11 - [26/Jul/2019:11:41:21 -0500] 404'
)
# A formatter for the date field
myFormatters <- list(date = function(x) lubridate::as_datetime(x, format = '%d/%b/%Y:%H:%M:%S %z'))
# A parser class for the date field
date_parser <- parser(
'[0-3][0-9]\\/[A-Z][a-z]{2}\\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}[ ][\\+|\\-][0-9]{4}',
myFormatters$date,
'date'
)
# Parse the logs from raw data
parse_logs(logs, template, list(date=date_parser))
# Write the logs and template to file, then parse
logfile <- tempfile()
templatefile <- tempfile()
writeLines(logs, logfile)
yaml::write_yaml(list(template=template, classes=list(date=date_parser)), templatefile)
parse_logs_file(logfile, templatefile, myFormatters)
file.remove(logfile)
file.remove(templatefile)