sample_events {linbin}    R Documentation

Sample Events

Description

Computes event table variables over the specified sampling intervals, or "bins".

Usage

sample_events(
  e,
  bins,
  ...,
  scaled.cols = NULL,
  col.names = NULL,
  drop.empty = FALSE
)

Arguments

e

An event table.

bins

An event table specifying the intervals for sampling.

...

Lists specifying the sampling functions and parameters to be used (see the Details).

scaled.cols

Names or indices of the event columns to be rescaled after cutting (see cut_events). Names are interpreted as regular expressions (regex) matching full column names.

col.names

Character vector of names for the columns output by the sampling functions. If NULL, the columns are named automatically (see the Details).

drop.empty

If TRUE, bins not intersecting any events are dropped.

Details

Events are cut at bin endpoints, and any scaled.cols columns are rescaled to the length of the resulting event segments. The event segments falling into each bin are passed to the sampling functions to compute the variables for each bin. Bins sample from the events they overlap: line events with which they share more than an endpoint, or point events with equal endpoints (if the bin itself is a point).
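The cutting and rescaling step can be illustrated with a small base R sketch. It does not use linbin: cut_and_rescale is a hypothetical helper, and it assumes that a scaled column is rescaled in proportion to the fraction of the original event length that falls in each segment.

```r
# Hypothetical helper (not part of linbin): cut one line event [from, to]
# at the given cut points and rescale a value to each resulting segment.
cut_and_rescale <- function(from, to, value, cuts) {
  # Keep only the cut points that fall strictly inside the event
  inner <- sort(unique(cuts[cuts > from & cuts < to]))
  breaks <- c(from, inner, to)
  seg_from <- breaks[-length(breaks)]
  seg_to <- breaks[-1]
  # Assumption: rescaling is proportional to segment length / event length
  data.frame(
    from = seg_from,
    to = seg_to,
    value = value * (seg_to - seg_from) / (to - from)
  )
}

# Event [15, 25] with length 10, cut at the bin endpoints 0, 10, 20, 30, 40
segments <- cut_and_rescale(15, 25, 10, c(0, 10, 20, 30, 40))
print(segments)
```

Under these assumptions the event is split at 20 into two segments, each carrying half of the original value. Without rescaling, each segment would keep the full value of 10.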

Sampling functions are specified in lists with the format list(FUN, data.cols, by = group.cols, ...). The first element in the list is the function to use. It must compute a single value from one or more vectors of the same length. The following unnamed element is a vector specifying the event column names or indices to pass in turn, one column per call, as the first argument of the function. Names are interpreted as regular expressions (regex) matching full column names. Additional unnamed elements are vectors specifying additional event columns to pass as the second, third, ... argument of the function. The first "by" element is a vector of event column names or indices used as grouping variables. Any additional named arguments are passed directly to the function. For example:

list(sum, 1:2, na.rm = TRUE) => sum(events[1], na.rm = TRUE), sum(events[2], na.rm = TRUE)
list(sum, 1, 3:4, 5) => sum(events[1], events[3], events[4], events[5]), ...
list(sum, c('x', 'y'), by = 3:4) => list(sum, 'x'), list(sum, 'y') grouped into all combinations of columns 3 and 4

Using the last example above, column names are taken from the first argument (e.g. x, y), and all grouping variables are appended (e.g. x.a, y.a, x.b, y.b), where a and b are the levels of columns 3 and 4. NA is also treated as a factor level. Columns are added left to right in order of the sampling function arguments. Finally, names are made unique by appending sequence numbers to duplicates (using make.unique).
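The following base R sketch mimics, for the event segments of a single bin, what the list specifications above describe. It is an illustration only and does not call linbin: the segs data frame and all variable names are hypothetical, and the "." separator is assumed from the x.a, y.a naming shown above.

```r
# Toy event segments falling into one bin (hypothetical data, not from linbin)
segs <- data.frame(length = c(5, 5), x = c(2, 1), f = c("b", "a"))

# list(sum, c('length', 'x')): one call per column named in the first vector
per_column <- sapply(c("length", "x"), function(col) sum(segs[[col]]))

# list(weighted.mean, 'x', 'length'): columns passed as 1st and 2nd arguments
wm <- weighted.mean(segs[["x"]], segs[["length"]])

# list(sum, 'length', by = 'f'): one value per level of the grouping column
by_f <- tapply(segs[["length"]], segs[["f"]], sum)

# Automatic names: data column names with the group levels appended
# (assumed "." separator), ordered as x.a, y.a, x.b, y.b
auto_names <- as.vector(outer(c("x", "y"), c("a", "b"), paste, sep = "."))

# Duplicated names receive sequence numbers
unique_names <- make.unique(c("length", "length", "x"))
```

Here tapply stands in for the grouping behaviour and outer for the name construction. Only make.unique is named in the text above as the function actually used.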

Value

The bins event table with the columns output by the sampling functions appended.

See Also

seq_events to generate sequential bins.

Examples

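# Event table with a length column, a numeric variable x and a factor f
e <- events(from = c(0, 10, 15, 25), to = c(10, 20, 25, 40), length = c(10, 10, 10, 15),
            x = c(1, 2, 1, 1), f = c('a', 'b', 'a', 'a'))
# Sequential bins spanning the event coverage, plus a point bin at 18
bins <- rbind(seq_events(event_coverage(e), 4), c(18, 18))
# Sum 'length' in each bin, without rescaling after cutting
sample_events(e, bins, list(sum, 'length'))
# Same, with 'length' rescaled to the length of each event segment
sample_events(e, bins, list(sum, 'length'), scaled.cols = 'length')
# Sum the rescaled 'length' separately for each level of 'f'
sample_events(e, bins, list(sum, 'length', by = 'f'), scaled.cols = 'length')
# Mean of 'x' weighted by the rescaled 'length'
sample_events(e, bins, list(weighted.mean, 'x', 'length'), scaled.cols = 'length')
# Concatenate the values of 'f' found in each bin
sample_events(e, bins, list(paste0, 'f', collapse = "."))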

[Package linbin version 0.1.3 Index]