survrtrunc {flexsurv}R Documentation

Nonparametric estimator of survival from right-truncated, uncensored data

Description

Estimates the survivor function from right-truncated, uncensored data by reversing time, interpreting the data as left-truncated, applying the Kaplan-Meier / Lynden-Bell estimator and transforming back.

Usage

survrtrunc(t, rtrunc, tmax, data = NULL, eps = 0.001, conf.int = 0.95)

Arguments

t

Vector of observed times from an initial event to a final event.

rtrunc

Individual-specific right truncation points, so that each individual's survival time t would not have been observed if it was greater than the corresponding element of rtrunc. If any of these are greater than tmax, then the actual individual-level truncation point for these individuals is taken to be tmax.

tmax

Maximum possible time to event that could have been observed.

data

Data frame to find t and rtrunc in. If not supplied, these should be in the working environment.

eps

Small number that is added to t before implementing the time-reversed estimator, to ensure the risk set is consistent between forward and reverse time scales. It should be just large enough that t+eps is not ==t. This should not need changing from the default of 0.001, unless t are extremely large or small and the data are rounded to integer.

conf.int

Confidence level, defaulting to 0.95.

Details

Note that this does not estimate the untruncated survivor function - instead it estimates the survivor function truncated above at a time defined by the maximum possible time that might have been observed in the data.

Define X as the time of the initial event, Y as the time of the final event, then we wish to determine the distribution of T = Y- X.

Observations are only recorded if Y \leq t_{max}. Then the distribution of T in the resulting sample is right-truncated by rtrunc = t_{max} - X.

Equivalently, the distribution of t_{max} - T is left-truncated, since it is only observed if t_{max} - T \geq X. Then the standard Kaplan-Meier type estimator as implemented in survfit is used (as described by Lynden-Bell, 1971) and the results transformed back.

This situation might happen in a disease epidemic, where X is the date of disease onset for an individual, Y is the date of death, and we wish to estimate the distribution of the time T from onset to death, given we have only observed people who have died by the date t_{max}.

If the estimated survival is unstable at the highest times, then consider replacing tmax by a slightly lower value, then if necessary, removing individuals with t > tmax, so that the estimand is changed to the survivor function truncated over a slightly narrower interval.

Value

A list with components:

time Time points where the estimated survival changes.

surv Estimated survival at time, truncated above at tmax.

se.surv Standard error of survival.

std.err Standard error of -log(survival). Named this way for consistency with survfit.

lower Lower confidence limits for survival.

upper Upper confidence limits for survival.

References

D. Lynden-Bell (1971) A method of allowing for known observational selection in small samples applied to 3CR quasars. Monthly Notices of the Royal Astronomical Society, 155:95–118.

Seaman, S., Presanis, A. and Jackson, C. (2020) Review of methods for estimating distribution of time to event from right-truncated data.

Examples


## simulate some event time data
set.seed(1)
X <- rweibull(100, 2, 10)
T <- rweibull(100, 2, 10)

## truncate above
tmax <- 20
obs <- X + T < tmax
rtrunc <- tmax - X
dat <- data.frame(X, T, rtrunc)[obs,]
sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)
plot(sf, conf.int=TRUE)
## Kaplan-Meier estimate ignoring truncation is biased
sfnaive <- survfit(Surv(T) ~ 1, data=dat)
lines(sfnaive, conf.int=TRUE, lty=2, col="red")

## truncate above the maximum observed time
tmax <- max(X + T) + 10
obs <- X + T < tmax
rtrunc <- tmax - X
dat <- data.frame(X, T, rtrunc)[obs,]
sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)
plot(sf, conf.int=TRUE)
## estimates identical to the standard Kaplan-Meier
sfnaive <- survfit(Surv(T) ~ 1, data=dat)
lines(sfnaive, conf.int=TRUE, lty=2, col="red")



[Package flexsurv version 2.3 Index]