group_time {epidm} | R Documentation
Group intervals or events together in time
Description
Groups observations that belong together in time: either overlapping time intervals with defined start and end dates, or events that fall within a static (fixed) or rolling window of time.
Usage
group_time(
x,
date_start,
date_end,
window,
window_type = c("rolling", "static"),
group_vars,
indx_varname = "indx",
min_varname = "date_min",
max_varname = "date_max",
.forceCopy = FALSE
)
Arguments
x
a data frame; this will be converted to a data.table
date_start
column containing the start dates for the grouping, provided quoted
date_end
column containing the end dates for the interval, provided quoted
window
an integer giving the time window in days, applied to the start date when grouping events
window_type
character; determines whether a 'rolling' or 'static' grouping method is used when grouping events
group_vars
a character vector of all the columns used to group records, provided quoted
indx_varname
a character string setting the name of the index column, which provides a grouping key; default is indx
min_varname
a character string setting the variable name for the time period minimum
max_varname
a character string setting the variable name for the time period maximum
.forceCopy
default FALSE; TRUE will force data.table to take a copy instead of editing the data by reference
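The difference between the two window_type options can be sketched in base R. This is only an illustration under stated assumptions, not the epidm implementation: sketch_window_groups is a hypothetical helper, it assumes the dates contain no NA values, that a 'static' window is measured from the first event of each group, that a 'rolling' window is measured from the most recent event in the group, and that an event exactly window days away still belongs to the group.

```r
# Hypothetical helper (not part of epidm): assign a group id to each event date
# under a static or rolling window, for a single combination of group_vars.
sketch_window_groups <- function(dates, window, type = c("rolling", "static")) {
  type <- match.arg(type)
  ord <- order(dates)                 # work through events in date order
  d <- as.numeric(dates[ord])         # days since 1970-01-01
  grp <- integer(length(d))
  if (length(d) == 0L) return(grp)

  id <- 1L
  anchor <- d[1]                      # date the window is measured from
  grp[1] <- id
  for (i in seq_along(d)[-1]) {
    if (d[i] - anchor > window) {
      # Assumption: the boundary is inclusive, so only a gap greater than
      # `window` days starts a new group.
      id <- id + 1L
      anchor <- d[i]
    } else if (type == "rolling") {
      # Rolling: every event in the group moves the window forward.
      anchor <- d[i]
    }
    grp[i] <- id
  }

  out <- integer(length(d))
  out[ord] <- grp                     # return ids in the original row order
  out
}

dates <- as.Date("2020-01-01") + c(0, 10, 20, 40)
sketch_window_groups(dates, window = 14, type = "static")
sketch_window_groups(dates, window = 14, type = "rolling")
```

With these dates the static window closes 14 days after the first event, so the third event opens a new group, whereas the rolling window keeps extending while consecutive events are no more than 14 days apart.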
Value
the original data.frame as a data.table with the following new fields:
indx; renamed using indx_varname
an id field for the new aggregated events/intervals; note that where the date_start is NA, an indx value will also be NA
date_min; renamed using min_varname
the start date for the aggregated events/intervals
date_max; renamed using max_varname
the end date for the aggregated events/intervals
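How the three fields relate for overlapping intervals can be sketched in base R. This is a minimal illustration, not the epidm implementation: sketch_merge_intervals is a hypothetical helper, it handles a single combination of group_vars, assumes at least one row and no NA dates, and assumes that an interval starting on the day an earlier one ends counts as overlapping. The output column names follow the defaults shown in Usage.

```r
# Hypothetical helper (not part of epidm): merge overlapping intervals and
# return an id plus the minimum and maximum date of each merged group.
sketch_merge_intervals <- function(start, end) {
  ord <- order(start, end)            # sort intervals by start date
  s <- as.numeric(start[ord])
  e <- as.numeric(end[ord])

  # Latest end date seen up to and including each interval.
  run_max <- cummax(e)

  # A new group begins when an interval starts after every earlier interval
  # has ended (assumption: a same-day start still counts as an overlap).
  new_grp <- c(TRUE, s[-1] > run_max[-length(run_max)])
  indx <- cumsum(new_grp)

  data.frame(
    start    = start[ord],
    end      = end[ord],
    indx     = indx,
    date_min = as.Date(ave(s, indx, FUN = min), origin = "1970-01-01"),
    date_max = as.Date(ave(e, indx, FUN = max), origin = "1970-01-01")
  )
}

res <- sketch_merge_intervals(
  start = as.Date(c("2020-01-01", "2020-01-05", "2020-02-01")),
  end   = as.Date(c("2020-01-10", "2020-01-20", "2020-02-03"))
)
res
```

Here the first two intervals overlap and share one id with the combined range 2020-01-01 to 2020-01-20, while the third interval forms its own group.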
Examples
# Events: specimen dates per patient, species and specimen type
episode_test <- structure(
  list(
    pat_id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
               1L, 1L, 1L, 1L, 2L, 2L, 2L),
    species = c(rep("E. coli", 7), rep("K. pneumonia", 7)),
    spec_type = c(rep("Blood", 7), rep("Blood", 4), rep("Sputum", 3)),
    sp_date = structure(c(18262, 18263, 18281, 18282, 18262, 18263, 18281,
                          18265, 18270, 18281, 18283, 18259, 18260, 18281),
                        class = "Date")
  ),
  row.names = c(NA, -14L), class = "data.frame")

# Group events within a 14-day static window; no date_end is supplied
group_time(x = episode_test,
           date_start = 'sp_date',
           window = 14,
           window_type = 'static',
           indx_varname = 'static_indx',
           group_vars = c('pat_id', 'species', 'spec_type'))[]

# Intervals: spells with start and end dates, including rows with missing dates
spell_test <- data.frame(
  id = c(rep(99, 6), rep(88, 4), rep(3, 3)),
  provider = c("YXZ", rep("ZXY", 5), rep("XYZ", 4), rep("YZX", 3)),
  spell_start = as.Date(
    c(
      "2020-03-01",
      "2020-07-07",
      "2020-02-08",
      "2020-04-28",
      "2020-03-15",
      "2020-07-01",
      "2020-01-01",
      "2020-01-12",
      "2019-12-25",
      "2020-03-28",
      "2020-01-01",
      rep(NA, 2)
    )
  ),
  spell_end = as.Date(
    c(
      "2020-03-10",
      "2020-07-26",
      "2020-05-22",
      "2020-04-30",
      "2020-05-20",
      "2020-07-08",
      "2020-01-23",
      "2020-03-30",
      "2020-01-02",
      "2020-04-20",
      "2020-01-01",
      rep(NA, 2)
    )
  )
)

# Group overlapping intervals, renaming the three output columns
group_time(x = spell_test,
           date_start = 'spell_start',
           date_end = 'spell_end',
           group_vars = c('id', 'provider'),
           indx_varname = 'spell_id',
           min_varname = 'spell_min_date',
           max_varname = 'spell_max_date')[]