survsim-package {survsim} | R Documentation |
Simulation of simple and complex survival data
Description
Simulation of cohorts in a context of simple and complex survival analysis, multiple events and recurrent events including several covariates, individual heterogeneity and periods at risk before and after the initial time of follow-up.
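Distribution | Survival function | Density function | Parametrization |
Weibull | $e^{-\lambda_j t^{p}}$ | $\lambda_j p\, t^{p-1} e^{-\lambda_j t^{p}}$ | $\lambda_j = e^{-p(\beta_0 + X_j \beta_j)}$ |
Log-normal | $1 - \Phi\left(\frac{\log t - \mu_j}{\sigma}\right)$ | $\frac{1}{t \sigma \sqrt{2\pi}}\, e^{-\frac{(\log t - \mu_j)^2}{2\sigma^2}}$ | $\mu_j = \beta_0 + X_j \beta_j$ |
Log-logistic | $\frac{1}{1 + (\lambda_j t)^{1/\gamma}}$ | $\frac{\lambda_j^{1/\gamma}\, t^{1/\gamma - 1}}{\gamma \left(1 + (\lambda_j t)^{1/\gamma}\right)^2}$ | $\lambda_j = e^{-(\beta_0 + X_j \beta_j)}$ |
Distribution | Time |
Weibull | $t = \left(-\frac{\log u}{e^{-p(\beta_0 + X_j \beta_j)}}\right)^{1/p}$ |
Log-normal | $t = e^{\beta_0 + X_j \beta_j + \sigma \Phi^{-1}(u)}$ |
Log-logistic | $t = e^{\beta_0 + X_j \beta_j + \gamma \log\left(\frac{u}{1-u}\right)}$ |
Where $\Phi$ is the standard normal cumulative distribution function and $u$ is a uniform random value on $(0, 1)$.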
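The following is a minimal sketch of inverse-transform sampling for the three distributions, written in base R rather than taken from the package source. It assumes a single Bernoulli covariate; the values of `beta0`, `beta`, `p`, `sigma` and `gam`, and all variable names, are illustrative assumptions and not package defaults.

```r
# Inverse-transform sampling sketch (base R only; not the package's own code).
set.seed(1)
n     <- 1000                 # number of subjects (illustrative)
beta0 <- 2                    # intercept beta_0 (illustrative)
beta  <- -0.5                 # coefficient of the covariate (illustrative)
x     <- rbinom(n, 1, 0.5)    # one Bernoulli covariate X_j
lp    <- beta0 + beta * x     # linear predictor beta_0 + X_j * beta_j
u     <- runif(n)             # uniform draws on (0, 1)

# Weibull with shape p and lambda_j = exp(-p * lp): solve S(t) = u for t
p        <- 1.5
lambda_w <- exp(-p * lp)
t_weib   <- (-log(u) / lambda_w)^(1 / p)

# Log-normal with standard deviation sigma on the log scale
sigma   <- 0.8
t_lnorm <- exp(lp + sigma * qnorm(u))

# Log-logistic with scale gam on the log scale
gam    <- 0.6
t_llog <- exp(lp + gam * log(u / (1 - u)))

summary(cbind(t_weib, t_lnorm, t_llog))
```

Each vector should agree with the corresponding base R quantile function (`qweibull` with `lower.tail = FALSE`, `qlnorm`, and `exp(qlogis(...))`), which gives a quick way to check the parametrization.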
In order to simulate censored survival data, two survival distributions are required: one for the uncensored survival times that would be observed if the follow-up had been long enough to reach the event, and another representing the censoring mechanism. The uncensored survival distribution, $T'_i$, for $i = 1, \ldots, n$ subjects, can be generated to depend on a set of covariates with a specified relationship with survival, which represents the true prognostic importance of each covariate (Burton, 2006). The package simulates times using Weibull (with the exponential as a particular case), log-normal and log-logistic distributions, as shown in the previous table.
To induce individual heterogeneity or within-subject correlation we generate $z_i$, a random effect covariate that follows a particular distribution (Uniform or Normal).
When $z_i = 1$ for all subjects, we are in the case of individual homogeneity and the survival times are completely specified by the covariates.
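As an illustration only, the sketch below contrasts the homogeneous and heterogeneous cases in base R. It assumes that the random effect rescales each survival time multiplicatively; that choice, the uniform bounds and every variable name are assumptions made for this example and are not taken from the package source.

```r
# Individual heterogeneity sketch (base R only).
set.seed(2)
n  <- 500
lp <- 2      # common linear predictor, no covariates (illustrative)
p  <- 1.5    # Weibull shape (illustrative)

# Baseline Weibull times with scale exp(lp)
t_base <- rweibull(n, shape = p, scale = exp(lp))

# Homogeneous case: the random effect equals 1 for every subject
z_hom <- rep(1, n)

# Heterogeneous case: random effect drawn from Uniform(0.5, 1.5)
z_het <- runif(n, 0.5, 1.5)

# ASSUMPTION: the random effect multiplies the survival time
t_hom <- z_hom * t_base
t_het <- z_het * t_base

# The heterogeneous times are more dispersed on the log scale
c(sd_log_hom = sd(log(t_hom)), sd_log_het = sd(log(t_het)))
```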
Random non-informative right censoring, $C_i$, can be generated in a similar manner to the uncensored survival times, $T'_i$, by assuming a particular distribution for the censoring times (previous table), but without including any covariates or individual heterogeneity.
The observation times, $Y'_i$, incorporating both events and censored observations, are calculated for each case by combining the uncensored survival times, $T'_i$, and the censoring times, $C_i$. If the uncensored survival time for an observation is less than or equal to the censoring time, the event is considered to be observed and the observation time equals the uncensored survival time; otherwise the event is considered censored and the observation time equals the censoring time. In other words, once $T'_i$ and $C_i$ have been simulated, we can define $Y'_i = \min(T'_i, C_i)$ as the observation time, with $\delta_i$ an indicator of non-censoring, i.e. $\delta_i = I(T'_i \le C_i)$.
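The combination of event and censoring times described above can be sketched in base R as follows. The Weibull parameters and variable names are illustrative assumptions, not package defaults.

```r
# Right-censoring sketch (base R only).
set.seed(3)
n <- 1000

# Uncensored survival times and independent censoring times
t_unc  <- rweibull(n, shape = 1.5, scale = exp(2))    # event times
c_time <- rweibull(n, shape = 1,   scale = exp(2.5))  # censoring times

# Observation time is the smaller of the two
y_obs <- pmin(t_unc, c_time)

# Non-censoring indicator: 1 if the event is observed, 0 if censored
delta <- as.integer(t_unc <= c_time)

head(data.frame(t_unc, c_time, y_obs, delta))
mean(delta == 0)   # observed proportion of censored subjects
```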
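While all $Y'_i$ start at 0, the package allows creating dynamic cohorts. We can generate entry times greater than 0 by adding a value $t_0$ drawn from a uniform distribution on $[0, t_{max}]$, where $t_{max}$ is the maximum follow-up time. We can also simulate subjects at risk before the initial time of follow-up, $t_0 < 0$, by drawing $t_0$ from a uniform distribution on $[-t_{old}, 0)$ for a fixed percentage of subjects, where $t_{old}$ is the maximum time at risk before follow-up starts. Then:
$T_i = T'_i + t_0, \quad Y_i = Y'_i + t_0$
where $t_0$ follows a uniform distribution on $[0, t_{max}]$ if the entry time is 0 or more, and a uniform distribution on $[-t_{old}, 0)$ if the entry time is less than 0.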
Therefore, $t_0$ represents the initial point of the episode, $Y_i$ the endpoint and $Y'_i$ the length. Note that $Y_i$ can be higher than $t_{max}$; in this case $Y_i$ is set to $t_{max}$ and $\delta_i = 0$. Observations for subjects at risk before the initial time of follow-up have negative $t_0$, so the initial point of the episode is set to 0. $Y_i$ may also be negative; in that case the episode is not included in the simulated data, since it would not be observed in practice.
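The entry-time rules described above can be sketched in base R as follows. This is not the package's implementation: `t_max`, `t_old`, `p_old`, the Weibull parameters and the column names `start`, `stop` and `status` are hypothetical choices made for this example.

```r
# Dynamic cohort sketch (base R only).
set.seed(4)
n     <- 1000
t_max <- 10    # maximum follow-up time (illustrative)
t_old <- 5     # maximum time at risk before follow-up starts (illustrative)
p_old <- 0.2   # fraction of subjects at risk before time 0 (illustrative)

# Episode length and non-censoring indicator
t_unc  <- rweibull(n, shape = 1.5, scale = exp(2))
c_time <- rweibull(n, shape = 1,   scale = exp(2.5))
y_len  <- pmin(t_unc, c_time)
delta  <- as.integer(t_unc <= c_time)

# Entry times: negative for a fraction p_old, otherwise in [0, t_max]
old <- runif(n) < p_old
t0  <- ifelse(old, runif(n, -t_old, 0), runif(n, 0, t_max))

# Endpoint of each episode on the follow-up time scale
y_end <- y_len + t0

# Episodes running past t_max are censored at t_max
over        <- y_end > t_max
y_end[over] <- t_max
delta[over] <- 0L

# Drop episodes that end before follow-up starts; start the rest no earlier than 0
keep   <- y_end >= 0
cohort <- data.frame(start = pmax(t0, 0), stop = y_end, status = delta)[keep, ]

head(cohort)
```

The resulting `cohort` data frame has one row per retained subject, with `start` never below 0 and `stop` never above `t_max`.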
Details
Package: | survsim |
Type: | Package |
Version: | 1.1.8 |
Date: | 2021-12-13 |
License: | GPL version 2 or newer |
LazyLoad: | yes |
The package provides a tool for simulating cohorts in a simple single-event context through the function simple.surv.sim, in a recurrent event context with the function rec.ev.sim, in a multiple event context with the function mult.ev.sim and in a competing risks context with the function crisk.sim. It also allows the user to generate aggregated data from the simulated cohort by means of the function accum.
Author(s)
David Moriña (Universitat de Barcelona) and Albert Navarro (Universitat Autònoma de Barcelona)
Maintainer: David Moriña Soler <dmorina@ub.edu>
References
Kelly PJ, Lim LL. Survival analysis for recurrent event data: an application to childhood infectious diseases. Stat Med 2000 Jan 15;19(1):13-33.
Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox proportional hazards models. Stat Med 2005 Jun 15;24(11):1713-1723.
Metcalfe C, Thompson SG. The importance of varying the event generation process in simulation studies of statistical methods for recurrent events. Stat Med 2006 Jan 15;25(1):165-179.
Burton A, Altman DG, Royston P, Holder RL. The design of simulation studies in medical statistics. Stat Med 2006 Dec 30;25(24):4279-4292.
Beyersmann J, Latouche A, Buchholz A, Schumacher M. Simulating competing risks data in survival analysis. Stat Med 2009 Jan 5;28(1):956-971.
Reis RJ, Utzet M, La Rocca PF, Nedel FB, Martin M, Navarro A. Previous sick leaves as predictor of subsequent ones. Int Arch Occup Environ Health 2011 Jun;84(5):491-499.
Navarro A, Moriña D, Reis R, Nedel FB, Martin M, Alvarado S. Hazard functions to describe patterns of new and recurrent sick leave episodes for different diagnoses. Scand J Work Environ Health 2012 Jan 27.
Moriña D, Navarro A. The R package survsim for the simulation of simple and complex survival data. Journal of Statistical Software 2014 Jul; 59(2):1-20.