genTDCM {genSurv} | R Documentation |
Generating data from a Cox model with time-dependent covariates
Description
Generating data from a Cox model with time-dependent covariates.
Usage
genTDCM(n, dist, corr, dist.par, model.cens, cens.par, beta, lambda)
Arguments
n |
Sample size. |
dist |
Bivariate distribution assumed for generating the two covariates (time-fixed and time-dependent). Possible bivariate distributions are "exponential" and "weibull" (see details below). |
corr |
Correlation parameter. For the bivariate exponential distribution, values must lie between -1 and 1 (0 corresponds to independence). For the bivariate Weibull distribution, any value greater than 0 and at most 1 is accepted (1 corresponds to independence). |
dist.par |
Vector of parameters for the chosen distribution. Two (scale) parameters for the bivariate exponential distribution, and four (two shape and two scale parameters) for the bivariate Weibull distribution, given in the order (shape1, scale1, shape2, scale2). See details below. |
model.cens |
Model for censoring. Possible values are "uniform" and "exponential". |
cens.par |
Parameter for the censoring distribution. Must be greater than 0. |
beta |
Vector of two regression parameters for the two covariates. |
lambda |
Parameter of the exponential distribution assumed for the baseline hazard function. |
Details
The bivariate exponential distribution, also known as the Farlie-Gumbel-Morgenstern distribution, is given by
F(x,y)=F_1(x)F_2(y)[1+\alpha(1-F_1(x))(1-F_2(y))]
for x\ge0 and y\ge0, where the marginal distribution functions F_1 and F_2 are exponential with scale parameters \theta_1 and \theta_2, and \alpha is the correlation parameter, -1 \le \alpha \le 1.
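As an illustration of the Farlie-Gumbel-Morgenstern distribution function above, the following base R sketch draws correlated exponential pairs by conditional inversion. It is not the code used inside genTDCM. The function name rfgm_exp and its arguments are hypothetical, and it assumes that \theta_1 and \theta_2 are scale (mean) parameters, so that F_i(x) = 1 - exp(-x/\theta_i).

```r
# Sketch: draw n pairs from an FGM bivariate exponential distribution.
# Assumption: theta1 and theta2 are scale (mean) parameters,
# i.e. F_i(x) = 1 - exp(-x / theta_i). rfgm_exp is a hypothetical helper.
rfgm_exp <- function(n, alpha, theta1, theta2) {
  stopifnot(alpha >= -1, alpha <= 1, theta1 > 0, theta2 > 0)
  u <- runif(n)                # U = F_1(X)
  w <- runif(n)                # auxiliary uniform for the conditional step
  a <- alpha * (1 - 2 * u)
  # P(V <= v | U = u) = v + a * v * (1 - v); solve this for v given w
  # (numerically stable form of the quadratic root, also valid when a = 0)
  v <- 2 * w / ((1 + a) + sqrt((1 + a)^2 - 4 * a * w))
  # Invert the exponential marginal distribution functions
  data.frame(x = -theta1 * log(1 - u), y = -theta2 * log(1 - v))
}

set.seed(1)
xy <- rfgm_exp(1e5, alpha = 1, theta1 = 1, theta2 = 2)
cor(xy$x, xy$y)  # with exponential marginals the theoretical value is alpha / 4
```

With exponential marginals the Pearson correlation of this family is \alpha/4, so the attainable dependence is limited to the range [-0.25, 0.25] even though \alpha itself ranges over [-1, 1].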
The bivariate Weibull distribution has two-parameter Weibull marginal distributions. Its survival function is given by
S(x,y)=P(X>x,Y>y)=e^{-[(\frac{x}{\theta_1})^\frac{\beta_1}{\delta}+(\frac{y}{\theta_2})^\frac{\beta_2}{\delta}]^\delta}
where 0 < \delta \le 1 and each marginal distribution has a shape parameter \beta_i and a scale parameter \theta_i, i = 1, 2.
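To make the role of \delta concrete, the sketch below evaluates the bivariate Weibull survival function in base R and checks two properties that follow from the formula: setting y = 0 recovers the univariate Weibull survival function, and \delta = 1 gives independence. The helper name biweibull_surv is hypothetical, and reading \delta as the corr argument is an assumption based on the argument description.

```r
# Sketch: evaluate the bivariate Weibull survival function S(x, y).
# shape = c(beta1, beta2), scale = c(theta1, theta2), 0 < delta <= 1.
# biweibull_surv is a hypothetical helper, not part of genSurv.
biweibull_surv <- function(x, y, shape, scale, delta) {
  stopifnot(delta > 0, delta <= 1, all(shape > 0), all(scale > 0))
  hx <- (x / scale[1])^(shape[1] / delta)
  hy <- (y / scale[2])^(shape[2] / delta)
  exp(-(hx + hy)^delta)
}

x <- c(0.5, 1, 2)
y <- c(1, 2, 3)
shape <- c(2, 2)
scale <- c(3, 3)

# Marginal check: S(x, 0) equals the univariate Weibull survival function.
marginal_ok <- all.equal(
  biweibull_surv(x, 0, shape, scale, delta = 0.8),
  pweibull(x, shape = 2, scale = 3, lower.tail = FALSE)
)

# Independence check: with delta = 1 the joint survival function factorises.
independence_ok <- all.equal(
  biweibull_surv(x, y, shape, scale, delta = 1),
  pweibull(x, 2, 3, lower.tail = FALSE) * pweibull(y, 2, 3, lower.tail = FALSE)
)

c(marginal = isTRUE(marginal_ok), independence = isTRUE(independence_ok))
```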
Value
An object with two classes, data.frame and TDCM.
To accommodate time-dependent effects, a counting process data structure, introduced by Andersen and Gill (1982), is used.
In this data structure, apart from the time-fixed covariate (named covariate), an individual's survival data are expressed by three variables: start, stop and event.
Individuals without a change in the time-dependent covariate (named tdcov) are represented by a single line of data, whereas individuals with a change in the time-dependent covariate must be represented by two lines.
For these individuals, the first line represents the time period until the change in the time-dependent covariate; the second line represents the time period from that change to the end of follow-up.
For each line of data, the variables start and stop mark the time interval (start, stop], while event is an indicator variable taking the value 1 if there was a death at time stop, and 0 otherwise.
More details about this data structure can be found in Meira-Machado et al. (2009).
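The layout described above can be illustrated with a small hand-made example. The sketch below converts a hypothetical one-row-per-individual table into the counting process format. The input columns (id, change, time, status), the helper to_counting_process and the coding of tdcov as a 0/1 indicator are assumptions made for illustration, not the internals of genTDCM.

```r
# Hypothetical wide-format input: one row per individual.
wide <- data.frame(
  id        = 1:3,
  covariate = c(0.5, 1.2, 0.8),  # time-fixed covariate
  change    = c(NA, 0.7, 1.5),   # time at which tdcov switches from 0 to 1
  time      = c(1.0, 2.0, 1.2),  # end of follow-up
  status    = c(1, 0, 1)         # 1 = death, 0 = censored
)

# Sketch: expand each individual into one or two (start, stop, event) lines.
to_counting_process <- function(wide) {
  rows <- lapply(seq_len(nrow(wide)), function(i) {
    r <- wide[i, ]
    # A change only counts if it happens before the end of follow-up
    changed <- !is.na(r$change) && r$change < r$time
    if (!changed) {
      data.frame(id = r$id, start = 0, stop = r$time, event = r$status,
                 covariate = r$covariate, tdcov = 0)
    } else {
      data.frame(id = r$id,
                 start = c(0, r$change), stop = c(r$change, r$time),
                 event = c(0, r$status),  # no event at the change time
                 covariate = r$covariate, tdcov = c(0, 1))
    }
  })
  do.call(rbind, rows)
}

long <- to_counting_process(wide)
print(long)
```

Individuals 1 and 3 contribute a single line each (no change is observed before the end of follow-up), while individual 2 contributes two lines that split the follow-up at the change time.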
Author(s)
Artur Araújo, Luís Meira Machado and Susana Faria
References
Andersen, P.K., Gill, R.D. (1982). Cox's regression model for counting processes: a large sample study. Annals of Statistics, 10(4), 1100-1120. doi: 10.1214/aos/1176345976
Cox, D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society: Series B, 34(2), 187-202. doi: 10.1111/j.2517-6161.1972.tb00899.x
Johnson, M. E. (1987). Multivariate Statistical Simulation, John Wiley and Sons.
Johnson, N., Kotz, S. (1972). Distributions in statistics: continuous multivariate distributions, John Wiley and Sons.
Lu, J., Bhattacharyya, G. (1990). Some new constructions of bivariate Weibull models. Annals of the Institute of Statistical Mathematics, 42(3), 543-559. doi: 10.1007/BF00049307
Meira-Machado, L., Cadarso-Suárez, C., De Uña-Álvarez, J., Andersen, P.K. (2009). Multi-state models for the analysis of time to event data. Statistical Methods in Medical Research, 18(2), 195-222. doi: 10.1177/0962280208092301
Meira-Machado, L., Faria, S. (2014). A simulation study comparing modeling approaches in an illness-death multi-state model. Communications in Statistics - Simulation and Computation, 43(5), 929-946. doi: 10.1080/03610918.2012.718841
Meira-Machado, L., Sestelo, M. (2019). Estimation in the progressive illness-death model: a nonexhaustive review. Biometrical Journal, 61(2), 245-263. doi: 10.1002/bimj.201700200
Therneau, T.M., Grambsch, P.M. (2000). Modelling survival data: Extending the Cox Model, New York: Springer.
Examples
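library(genSurv)
library(survival)

# Bivariate Weibull covariates (corr = 0.8) with uniform censoring
tdcmdata <- genTDCM(n = 1000, dist = "weibull", corr = 0.8, dist.par = c(2, 3, 2, 3),
                    model.cens = "uniform", cens.par = 2.5, beta = c(-3.3, 4), lambda = 1)
head(tdcmdata, n = 20L)

# Fit a Cox model to the counting process (start, stop, event) data
fit1 <- coxph(Surv(start, stop, event) ~ tdcov + covariate, data = tdcmdata)
summary(fit1)

# Bivariate exponential covariates, independent (corr = 0), with uniform censoring
tdcmdata2 <- genTDCM(n = 1000, dist = "exponential", corr = 0, dist.par = c(1, 1),
                     model.cens = "uniform", cens.par = 1, beta = c(-3, 2), lambda = 0.5)
head(tdcmdata2, n = 20L)

fit2 <- coxph(Surv(start, stop, event) ~ tdcov + covariate, data = tdcmdata2)
summary(fit2)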