splitMulti {popEpi} | R Documentation |
Split case-level observations
Description
Split a Lexis object along multiple time scales with speed and ease.
Usage
splitMulti(
  data,
  breaks = NULL,
  ...,
  drop = TRUE,
  merge = TRUE,
  verbose = FALSE
)
Arguments
data |
a Lexis object with event cases as rows |
breaks |
a list of named numeric vectors of breaks; see Details and Examples |
... |
alternate way of supplying breaks as named vectors;
e.g. |
drop |
logical; if |
merge |
logical; if |
verbose |
logical; if |
Details
splitMulti is in essence a data.table version of splitLexis or survSplit for splitting along multiple time scales. It requires a Lexis object as input.
The breaks must be a list of named vectors of the appropriate type. The breaks are fully explicit, left-inclusive and right-exclusive: e.g. fot=c(0,5) forces the data to only include time within [0,5) for each original row (unless drop = FALSE). Use Inf or -Inf for open-ended intervals, e.g. per=c(1990,1995,Inf) creates the intervals [1990,1995), [1995, Inf).
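The interval convention can be illustrated without popEpi at all. The following is only a sketch using base R's cut() with right = FALSE, which follows the same left-inclusive, right-exclusive rule. It does not call splitMulti, and the values in per_values are made up for illustration.

```r
# hypothetical calendar-year values, made up for this illustration
per_values <- c(1990, 1994.99, 1995, 2020)
per_breaks <- c(1990, 1995, Inf)

# right = FALSE gives left-inclusive, right-exclusive intervals;
# dig.lab = 4 only keeps the interval labels readable
per_interval <- cut(per_values, breaks = per_breaks,
                    right = FALSE, dig.lab = 4)

# index of the interval each value falls into;
# the value 1995 itself belongs to the second interval, not the first
interval_index <- as.integer(per_interval)
print(data.frame(per_values, per_interval, interval_index))
```

A value that equals a break therefore always starts a new interval, and Inf as the last break keeps the final interval open-ended.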
Instead of specifying breaks, one may use the ... argument to pass breaks: e.g. splitMulti(x, breaks = list(fot = 0:5)) is equivalent to splitMulti(x, fot = 0:5). Multiple breaks can be supplied in the same manner. However, if both breaks and ... are used, only the breaks in breaks are utilized within the function.
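As a minimal sketch of the two calling styles, the following builds a tiny Lexis object and splits it both ways. The data frame toy and its columns dg_yr, ex_yr and status are hypothetical and exist only for this illustration; the Epi and popEpi packages are assumed to be installed.

```r
library("Epi")
library("popEpi")

# hypothetical toy data: two subjects, times in fractional years
toy <- data.frame(
  dg_yr  = c(2000.5, 2003.0),   # diagnosis (entry) time
  ex_yr  = c(2004.0, 2010.25),  # exit time
  status = c(1L, 0L)            # exit status
)

lex <- Lexis(
  entry = list(fot = 0, per = dg_yr),
  exit = list(per = ex_yr),
  exit.status = status,
  data = toy
)

# breaks supplied as a named list ...
a <- splitMulti(lex, breaks = list(fot = 0:5))
# ... or as named vectors through the dots
b <- splitMulti(lex, fot = 0:5)

# both calls should produce the same split follow-up
same_rows <- nrow(a) == nrow(b)
same_durations <- isTRUE(all.equal(a$lex.dur, b$lex.dur))
print(c(same_rows = same_rows, same_durations = same_durations))
```

Because drop = TRUE by default, follow-up beyond fot = 5 is discarded in both results.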
The Lexis time scale variables can be of any arbitrary format, e.g. Date, fractional years (see cal.yr and get.yrs), or other. However, using date variables (from package date) is not recommended, as date variables are always stored as integers, whereas Date variables (see ?as.Date) are typically stored in double ("numeric") format. Double storage allows breaks to fall on fractions of a day, e.g. when using hypothetical years of 365.25 days.
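The point about storage can be checked in base R. This is a small sketch under the assumption that the Date was created with as.Date() from a character string; Date objects created by other means may be stored differently.

```r
d <- as.Date("2000-01-01")

# storage mode of this particular Date object
storage <- typeof(d)

# a Date is a count of days since 1970-01-01
days_since_epoch <- unclass(d)

# adding half a day keeps the fraction in the underlying number
half_day_later <- unclass(d + 0.5)

# follow-up breaks every 3 months up to 5 years, expressed in days
# using hypothetical years of 365.25 days
fot_breaks_days <- seq(0, 5, by = 3/12) * 365.25

# at least some of these breaks fall on fractions of a day
has_fractional_breaks <- any(fot_breaks_days != round(fot_breaks_days))
print(list(storage = storage, first_breaks = head(fot_breaks_days, 3)))
```

With an integer-only time scale, breaks such as the quarter-year values above could not be represented exactly.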
Value
A data.table or data.frame (depending on options("popEpi.datatable"); see ?popEpi) object expanded to accommodate split observations.
Author(s)
Joonas Miettinen
See Also
Other splitting functions: lexpand(), splitLexisDT()
Examples
#### let's prepare data for computing period method survivals
#### in case there are problems with dates, we first
#### convert to fractional years.
library("Epi")
library("data.table")
library("popEpi")
data("sire", package = "popEpi")
x <- Lexis(data = sire[dg_date < ex_date, ],
           entry = list(fot = 0, per = get.yrs(dg_date), age = dg_age),
           exit = list(per = get.yrs(ex_date)), exit.status = status)
x2 <- splitMulti(x, breaks = list(fot = seq(0, 5, by = 3/12), per = c(2008, 2013)))
# equivalently:
x2 <- splitMulti(x, fot = seq(0, 5, by = 3/12), per = c(2008, 2013))
## using dates; note: breaks must be expressed as dates or days!
x <- Lexis(data = sire[dg_date < ex_date, ],
           entry = list(fot = 0, per = dg_date, age = dg_date - bi_date),
           exit = list(per = ex_date), exit.status = status)
BL <- list(fot = seq(0, 5, by = 3/12) * 365.242199,
           per = as.Date(paste0(c(1980:2014), "-01-01")),
           age = c(0, 45, 85, Inf) * 365.242199)
x2 <- splitMulti(x, breaks = BL, verbose = TRUE)
## multistate example (healthy - sick - dead)
sire2 <- data.frame(sire)
sire2 <- sire2[sire2$dg_date < sire2$ex_date, ]
set.seed(1L)
not_sick <- sample.int(nrow(sire2), 6000L, replace = FALSE)
sire2$dg_date[not_sick] <- NA
sire2$status[!is.na(sire2$dg_date) & sire2$status == 0] <- -1
sire2$status[sire2$status == 2] <- 1
sire2$status <- factor(sire2$status, levels = c(0, -1, 1),
                       labels = c("healthy", "sick", "dead"))
xm <- Lexis(data = sire2,
            entry = list(fot = 0, per = get.yrs(bi_date), age = 0),
            exit = list(per = get.yrs(ex_date)), exit.status = status)
xm2 <- cutLexis(xm, cut = get.yrs(xm$dg_date),
                timescale = "per",
                new.state = "sick")
xm2[xm2$lex.id == 6L, ]
xm2 <- splitMulti(xm2, breaks = list(fot = seq(0, 150, 25)))
xm2[xm2$lex.id == 6L, ]