stackL {survML}		R Documentation
Estimate a conditional survival function via local survival stacking
Description
Estimate a conditional survival function via local survival stacking
Usage
stackL(
  time,
  event = rep(1, length(time)),
  entry = NULL,
  X,
  newX,
  newtimes,
  direction = "prospective",
  bin_size = NULL,
  time_basis = "continuous",
  learner = "SuperLearner",
  SL_control = list(SL.library = c("SL.mean"), V = 10, method = "method.NNLS",
    stratifyCV = FALSE),
  tau = NULL
)
Arguments
time: n x 1 numeric vector of observed follow-up times. If there is censoring, these are the minimum of the event and censoring times.

event: n x 1 numeric vector of status indicators of whether an event was observed. Defaults to a vector of 1s, i.e. no censoring.

entry: Study entry variable, if applicable. Defaults to NULL, indicating that there is no truncation.

X: n x p data.frame of observed covariate values on which to train the estimator.

newX: m x p data.frame of new covariate values at which to obtain m predictions from the estimated algorithm. Must have the same names and structure as X.

newtimes: k x 1 numeric vector of times at which to obtain k predicted conditional survival probabilities.

direction: Whether the data come from a prospective or retrospective study. This determines whether the data are treated as subject to left truncation and right censoring ("prospective") or right truncation alone ("retrospective").

bin_size: Size of bins for the discretization of time. A value between 0 and 1 indicating the size of observed event time quantiles on which to grid times (e.g. 0.02 creates a grid of 50 times evenly spaced on the quantile scale); see the short sketch after this argument list. If NULL, defaults to every observed event time.

time_basis: How to treat time for training the binary classifier. Options are "continuous" and "dummy", meaning an indicator variable is included for each time in the time grid.

learner: Which binary regression algorithm to use. Currently, only "SuperLearner" is supported.

SL_control: Named list of parameters controlling the Super Learner fitting process. These parameters are passed directly to the SuperLearner function: SL.library (library of candidate algorithms for the binary classification Super Learner), V (number of cross-validation folds, defaults to 10), method (method for estimating the ensemble weights, defaults to "method.NNLS"), and stratifyCV (logical indicating whether cross-validation should be stratified by outcome).

tau: The maximum time of interest in a study, used for retrospective conditional survival estimation. Rather than dealing with right truncation separately from left truncation, it is simpler to estimate the survival function of tau - T. Defaults to NULL.
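As an informal illustration of the bin_size argument (a sketch of the discretization idea only, not code taken from stackL's internals), a value of 0.1 corresponds roughly to a grid of times placed at evenly spaced quantiles of the observed event times:

# Illustrative sketch only; stackL constructs its time grid internally.
obs_event_times <- rexp(200)  # hypothetical stand-in for observed event times
time_grid <- quantile(obs_event_times, probs = seq(0, 1, by = 0.1))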
Value
A named list of class stackL, with the following components:

S_T_preds: An m x k matrix of estimated event time survival probabilities at the m covariate vector values and k times provided by the user in newX and newtimes, respectively.

fit: The Super Learner fit for binary classification on the stacked dataset.
References
Polley E.C. and van der Laan M.J. (2011). "Super Learning for Right-Censored Data" in Targeted Learning.
Craig E., Zhong C., and Tibshirani R. (2021). "Survival stacking: casting survival analysis as a classification problem."
Examples
# This is a small simulation example
set.seed(123)
n <- 500
X <- data.frame(X1 = rnorm(n), X2 = rbinom(n, size = 1, prob = 0.5))
# True conditional survival function of the event time T
S0 <- function(t, x){
  pexp(t, rate = exp(-2 + x[,1] - x[,2] + .5 * x[,1] * x[,2]), lower.tail = FALSE)
}
T <- rexp(n, rate = exp(-2 + X[,1] - X[,2] + .5 * X[,1] * X[,2]))
# True conditional survival function of the censoring time C
G0 <- function(t, x) {
  as.numeric(t < 15) * .9 * pexp(t,
    rate = exp(-2 - .5 * x[,1] - .25 * x[,2] + .5 * x[,1] * x[,2]),
    lower.tail = FALSE)
}
C <- rexp(n, exp(-2 -.5 * X[,1] - .25 * X[,2] + .5 * X[,1] * X[,2]))
C[C > 15] <- 15
entry <- runif(n, 0, 15)
time <- pmin(T, C)
event <- as.numeric(T <= C)
# Left truncation: subjects are observed only if the follow-up time exceeds study entry
sampled <- which(time >= entry)
X <- X[sampled,]
time <- time[sampled]
event <- event[sampled]
entry <- entry[sampled]
# Note that this is a very small Super Learner library, for computational purposes.
SL.library <- c("SL.mean", "SL.glm")
fit <- stackL(time = time,
              event = event,
              entry = entry,
              X = X,
              newX = X,
              newtimes = seq(0, 15, .1),
              direction = "prospective",
              bin_size = 0.1,
              time_basis = "continuous",
              SL_control = list(SL.library = SL.library,
                                V = 5))
# Compare estimated and true survival probabilities for the first observation
plot(fit$S_T_preds[1,], S0(t = seq(0, 15, .1), X[1,]))
abline(0, 1, col = 'red')
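The rows of S_T_preds index the rows of newX and the columns index newtimes. As a small follow-up to the example above (an illustration only, not additional stackL functionality), the estimated survival probability for the first subject at t = 5 can be read off as:

new_grid <- seq(0, 15, .1)
fit$S_T_preds[1, which.min(abs(new_grid - 5))]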