| splitMulti {popEpi} | R Documentation |
Split case-level observations
Description
Split a Lexis object along multiple time scales
with speed and ease
Usage
splitMulti(
data,
breaks = NULL,
...,
drop = TRUE,
merge = TRUE,
verbose = FALSE
)
Arguments
data |
a Lexis object with event cases as rows |
breaks |
a list of named numeric vectors of breaks; see Details and Examples |
... |
alternate way of supplying breaks as named vectors;
e.g. |
drop |
logical; if |
merge |
logical; if |
verbose |
logical; if |
Details
splitMulti is in essence a data.table version of
splitLexis or survSplit for splitting along multiple
time scales.
It requires a Lexis object as input.
The breaks must be a list of named vectors of the appropriate type.
The breaks are fully explicit,
left-inclusive and right-exclusive: e.g. fot=c(0,5)
restricts each original row to the follow-up time within
[0,5) (unless drop = FALSE).
Use Inf or -Inf for open-ended intervals,
e.g. per=c(1990,1995,Inf) creates the intervals
[1990,1995), [1995, Inf).
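The interval convention above can be illustrated with a small base-R sketch. The helper split_interval is hypothetical and not part of popEpi. It only mimics the documented left-inclusive, right-exclusive convention for a single follow-up interval, and it assumes that drop = TRUE discards any time outside the range of the breaks. It is not how splitMulti is implemented.

```r
# Hypothetical helper (not part of popEpi): split one follow-up interval
# [entry, exit) at the given breaks, following the convention described above.
split_interval <- function(entry, exit, breaks, drop = TRUE) {
  breaks <- sort(unique(breaks))
  # cut points that fall strictly inside the follow-up interval
  inner <- breaks[breaks > entry & breaks < exit]
  out <- data.frame(start = c(entry, inner), stop = c(inner, exit))
  if (drop) {
    # keep only the pieces lying inside [min(breaks), max(breaks))
    keep <- out$start >= min(breaks) & out$stop <= max(breaks)
    out <- out[keep, , drop = FALSE]
  }
  rownames(out) <- NULL
  out
}

# follow-up from 0 to 7 with breaks c(0, 5): only [0, 5) is kept
split_interval(0, 7, breaks = c(0, 5))

# with drop = FALSE the piece [5, 7) is kept as well
split_interval(0, 7, breaks = c(0, 5), drop = FALSE)

# an open-ended last interval: pieces [1992.5, 1995) and [1995, 1998)
split_interval(1992.5, 1998, breaks = c(1990, 1995, Inf))
```

Each returned row is one piece [start, stop) of the original interval.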
Instead of specifying breaks, one may make use of the ...
argument to pass breaks: e.g.
splitMulti(x, breaks = list(fot = 0:5))
is equivalent to
splitMulti(x, fot = 0:5).
Multiple breaks can be supplied in the same manner. However, if both
breaks and ... are used, only the breaks in breaks
are utilized within the function.
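The precedence rule described above can be sketched in base R. The function collect_breaks is a hypothetical illustration of the documented behaviour (breaks wins over ... when both are given) and is not taken from the popEpi source.

```r
# Hypothetical illustration (not popEpi code): collate breaks supplied either
# as a list via `breaks` or as named vectors via `...`.
collect_breaks <- function(breaks = NULL, ...) {
  if (!is.null(breaks)) {
    # `breaks` takes precedence; anything passed via `...` is ignored
    return(breaks)
  }
  list(...)
}

# both ways of supplying breaks yield the same list
identical(collect_breaks(breaks = list(fot = 0:5)),
          collect_breaks(fot = 0:5))

# when both are used, only the contents of `breaks` survive
names(collect_breaks(breaks = list(fot = 0:5), per = c(2008, 2013)))
```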
The Lexis time scale variables can be of any arbitrary
format, e.g. Date,
fractional years (see cal.yr and get.yrs),
or other. However, using date variables (from package date)
is not recommended, as date variables are always stored as integers,
whereas Date variables (see ?as.Date) are typically stored
in double ("numeric") format. This allows for breaking days into fractions
as well, when using e.g. hypothetical years of 365.25 days.
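A short base-R sketch of why the double storage of Date matters. It uses only base functions; the variable names are illustrative.

```r
# A Date created from a string is stored as a double counting days
# since 1970-01-01
d <- as.Date("2000-01-01")
typeof(d)    # "double"
unclass(d)   # 10957

# because the storage is double, fractions of a day are preserved
half_day_later <- d + 0.5
unclass(half_day_later)   # 10957.5

# breaks on a follow-up time scale measured in days, using
# hypothetical years of 365.25 days; several are not whole days
fot_breaks_days <- seq(0, 5, by = 1) * 365.25
fot_breaks_days
```

Breaks such as 365.25 could not be represented exactly on a time scale that is restricted to whole days.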
Value
A data.table or data.frame
(depending on options("popEpi.datatable"); see ?popEpi)
object expanded to accommodate split observations.
Author(s)
Joonas Miettinen
See Also
Other splitting functions:
lexpand(),
splitLexisDT()
Examples
#### let's prepare data for computing period method survivals
#### in case there are problems with dates, we first
#### convert to fractional years.
library("Epi")
library("data.table")
data("sire", package = "popEpi")
x <- Lexis(data=sire[dg_date < ex_date, ],
entry = list(fot=0, per=get.yrs(dg_date), age=dg_age),
exit=list(per=get.yrs(ex_date)), exit.status=status)
x2 <- splitMulti(x, breaks = list(fot=seq(0, 5, by = 3/12), per=c(2008, 2013)))
# equivalently:
x2 <- splitMulti(x, fot=seq(0, 5, by = 3/12), per=c(2008, 2013))
## using dates; note: breaks must be expressed as dates or days!
x <- Lexis(data=sire[dg_date < ex_date, ],
entry = list(fot=0, per=dg_date, age=dg_date-bi_date),
exit=list(per=ex_date), exit.status=status)
BL <- list(fot = seq(0, 5, by = 3/12)*365.242199,
per = as.Date(paste0(c(1980:2014),"-01-01")),
age = c(0,45,85,Inf)*365.242199)
x2 <- splitMulti(x, breaks = BL, verbose=TRUE)
## multistate example (healthy - sick - dead)
sire2 <- data.frame(sire)
sire2 <- sire2[sire2$dg_date < sire2$ex_date, ]
set.seed(1L)
not_sick <- sample.int(nrow(sire2), 6000L, replace = FALSE)
sire2$dg_date[not_sick] <- NA
sire2$status[!is.na(sire2$dg_date) & sire2$status == 0] <- -1
sire2$status[sire2$status==2] <- 1
sire2$status <- factor(sire2$status, levels = c(0, -1, 1),
labels = c("healthy", "sick", "dead"))
xm <- Lexis(data = sire2,
entry = list(fot=0, per=get.yrs(bi_date), age=0),
exit = list(per=get.yrs(ex_date)), exit.status=status)
xm2 <- cutLexis(xm, cut = get.yrs(xm$dg_date),
timescale = "per",
new.state = "sick")
xm2[xm2$lex.id == 6L, ]
xm2 <- splitMulti(xm2, breaks = list(fot = seq(0,150,25)))
xm2[xm2$lex.id == 6L, ]