R: isolate sections of overlapping intervals

isolateoverlaps {intervalaverage}

R Documentation

isolate sections of overlapping intervals

Description

Given a set of intervals in a table, isolate the sections of intervals that overlap with other intervals (optionally within groups). Returns a data.table containing intervals that are either mutually non-overlapping or exactly overlapping with other intervals (i.e., there are no partially overlapping intervals), optionally within groups. Note that this doesn't just return the intersections; the original interval data is conserved such that, for each interval/row in x, the return table has one or more non-overlapping intervals whose union is that original interval.

Usage

isolateoverlaps(
  x,
  interval_vars,
  group_vars = NULL,
  interval_vars_out = c("start", "end")
)

Arguments

`x`	A data.table containing a set of intervals.
`interval_vars`	A length-2 character vector denoting column names in x. These columns must be of the same class and be integer or IDate. The column name specifying the lower-bound column must be specified first.
`group_vars`	NULL, or a character vector denoting column names in x. These columns serve as grouping variables such that testing for overlaps and subsequent isolation only occur within categories defined by the combination of the group variables.
`interval_vars_out`	The desired column names of the interval columns in the return data.table. By default these columns are named `c("start","end")`. If x already contains columns with the same names as those specified in `interval_vars_out`, the function will return an error to avoid naming confusion in the return table.
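The name-clash rule for `interval_vars_out` can be checked before calling the function. The following is a minimal base R sketch, assuming a hypothetical input whose interval columns are already called `start` and `end`; the `iso_` prefix is an arbitrary choice for illustration, not something the package prescribes.

```r
# Hypothetical input whose columns are already called "start" and "end".
x <- data.frame(start = c(1L, 5L), end = c(7L, 9L), value = c(0.3, 1.2))

interval_vars_out <- c("start", "end")  # the documented default

# Requested output names that already exist in x.
clash <- intersect(interval_vars_out, names(x))

# On a clash, pick different output names (the "iso_" prefix is arbitrary).
if (length(clash) > 0L) {
  interval_vars_out <- paste0("iso_", interval_vars_out)
}
interval_vars_out
```

The adjusted vector could then be passed as the `interval_vars_out` argument.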

Details

All intervals are treated as closed (i.e., inclusive of the start and end values in the columns specified by interval_vars).
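To make the closed-interval convention concrete, here is a minimal base R sketch of one way to split closed integer intervals into pieces that are either disjoint or identical. It is an illustration only: `isolate_sketch` is a hypothetical helper, it ignores grouping, and it is not claimed to be how isolateoverlaps is implemented.

```r
# Hypothetical helper, not part of intervalaverage.
isolate_sketch <- function(start, end) {
  stopifnot(is.integer(start), is.integer(end), all(start <= end))
  # A new piece begins at every start value and right after every end value
  # (end + 1, because the end value itself belongs to the interval).
  breaks <- sort(unique(c(start, end + 1L)))
  seg_start <- breaks[-length(breaks)]
  seg_end <- breaks[-1L] - 1L
  # For each original interval, keep the elementary pieces it fully contains.
  pieces <- lapply(seq_along(start), function(i) {
    keep <- seg_start >= start[i] & seg_end <= end[i]
    data.frame(row = i, orig_start = start[i], orig_end = end[i],
               start = seg_start[keep], end = seg_end[keep])
  })
  do.call(rbind, pieces)
}

# [1,7] and [7,14] share the value 7, so they overlap under closed intervals.
out <- isolate_sketch(start = c(1L, 7L), end = c(7L, 14L))
out
# The first interval is split into [1,6] and [7,7],
# the second into [7,7] and [8,14].
```

The shared piece [7,7] appears once per original row, which is what allows values from overlapping rows to be combined afterwards by grouping on the new interval columns.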

x is not copied but rather passed by reference to function internals, but the order of this data.table is restored on function completion or error.
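For readers unfamiliar with data.table's reference semantics, the sketch below shows why row order needs restoring at all. It illustrates general data.table behaviour with a hypothetical helper, `sort_in_place`; it does not reproduce the internals of isolateoverlaps.

```r
library(data.table)

dt <- data.table(id = c(3L, 1L, 2L), value = c("c", "a", "b"))

# Hypothetical helper: setorderv() sorts its argument in place, so the
# caller's table is reordered even though nothing is assigned back.
sort_in_place <- function(d) {
  setorderv(d, "id")
  invisible(d)
}

sort_in_place(dt)
dt$id  # the caller's table is now sorted by id: 1 2 3
```

A caller who wants to rule out any side effects on the input can pass `copy(x)` instead of `x`, at the cost of extra memory.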

Value

A data.table with the columns named in interval_vars_out, which denote the start and end of each new interval. The return table also contains the columns in x (including the original interval columns).

Examples

library(data.table)
library(intervalaverage)

set.seed(23)
# Three exposure intervals per address; consecutive intervals share an
# endpoint (7 and 14), so they overlap because intervals are closed.
x2 <- data.table(
  addr_id = rep(1:4, each = 3),
  exposure_start = rep(c(1L, 7L, 14L), times = 4),
  exposure_end = rep(c(7L, 14L, 21L), times = 4),
  exposure_value = rnorm(12)
)
x2z <- isolateoverlaps(x2, interval_vars = c("exposure_start", "exposure_end"), group_vars = "addr_id")
x2z
# x2b represents x2 where exposure values in overlapping intervals have been averaged
x2b <- x2z[, list(exposure_value = mean(exposure_value)), by = c("addr_id", "start", "end")]

[Package intervalaverage version 0.8.0 Index]