genTDCM {genSurv}    R Documentation

Generating data from a Cox model with time-dependent covariates

Description

Generates survival data from a Cox model with one time-fixed and one time-dependent covariate.

Usage

genTDCM(n, dist, corr, dist.par, model.cens, cens.par, beta, lambda)

Arguments

n

Sample size.

dist

Bivariate distribution assumed for generating the two covariates (time-fixed and time-dependent). Possible bivariate distributions are "exponential" and "weibull" (see details below).

corr

Correlation parameter. For the bivariate exponential distribution, values between -1 and 1 are allowed (0 gives independence). For the bivariate Weibull distribution, any value greater than 0 and up to 1 is accepted (1 gives independence).

dist.par

Vector of parameters for the chosen distribution: two scale parameters for the bivariate exponential distribution, or four parameters (two shapes and two scales) for the bivariate Weibull distribution, given in the order (shape1, scale1, shape2, scale2). See details below.

model.cens

Model for censoring. Possible values are "uniform" and "exponential".

cens.par

Parameter of the censoring distribution. Must be greater than 0.

beta

Vector of two regression parameters for the two covariates.

lambda

Parameter of the exponential distribution assumed for the baseline hazard function.

Details

The bivariate exponential distribution, also known as the Farlie-Gumbel-Morgenstern distribution, is given by

F(x,y)=F_1(x)F_2(y)[1+\alpha(1-F_1(x))(1-F_2(y))]

for x \ge 0 and y \ge 0, where the marginal distribution functions F_1 and F_2 are exponential with scale parameters \theta_1 and \theta_2, and \alpha is the correlation parameter, -1 \le \alpha \le 1.
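As an illustration of the distribution above, the following sketch draws pairs from it by conditional inversion of the copula. It is not necessarily how genTDCM generates its covariates. The helper name rfgm_exp is hypothetical, and the sketch assumes that each scale parameter \theta_i is the mean of the corresponding exponential marginal.

```r
# Hypothetical helper (not part of genSurv): sample from the
# Farlie-Gumbel-Morgenstern distribution with exponential marginals.
# Assumption: theta1 and theta2 are scale parameters, i.e. the marginal means.
rfgm_exp <- function(n, alpha, theta1, theta2) {
  stopifnot(abs(alpha) <= 1, theta1 > 0, theta2 > 0)
  u <- runif(n)                      # F_1(x), uniform on (0, 1)
  w <- runif(n)                      # value of the conditional CDF of v given u
  a <- alpha * (1 - 2 * u)
  # Solve v + a * v * (1 - v) = w for v; this form is also valid when a = 0
  v <- 2 * w / ((1 + a) + sqrt((1 + a)^2 - 4 * a * w))
  # Invert the exponential marginal distribution functions
  data.frame(x = -theta1 * log(1 - u),
             y = -theta2 * log(1 - v))
}

set.seed(1)
xy <- rfgm_exp(20000, alpha = 1, theta1 = 1, theta2 = 2)
cor(xy$x, xy$y, method = "spearman")
```

For this family, Spearman's rank correlation equals \alpha/3 in theory, so the dependence it can represent is limited even at \alpha = 1 or \alpha = -1.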

The bivariate Weibull distribution has two-parameter marginal distributions. Its survival function is given by

S(x,y)=P(X>x,Y>y)=e^{-[(\frac{x}{\theta_1})^\frac{\beta_1}{\delta}+(\frac{y}{\theta_2})^\frac{\beta_2}{\delta}]^\delta}

where 0 < \delta \le 1 and each marginal distribution has a shape parameter \beta_i and a scale parameter \theta_i, i = 1, 2.
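Two special cases of the bivariate Weibull survival function follow directly from its formula and may help in reading the parameters. Setting y = 0 recovers the Weibull marginal of X, and setting \delta = 1 factorizes the joint survival function, which is why a value of 1 corresponds to independence:

```latex
S(x,0) = e^{-\left[\left(\frac{x}{\theta_1}\right)^{\frac{\beta_1}{\delta}}\right]^{\delta}}
       = e^{-\left(\frac{x}{\theta_1}\right)^{\beta_1}},
\qquad
S(x,y)\big|_{\delta=1}
       = e^{-\left(\frac{x}{\theta_1}\right)^{\beta_1}}
         \, e^{-\left(\frac{y}{\theta_2}\right)^{\beta_2}}
       = S(x,0)\,S(0,y).
```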

Value

An object with two classes, data.frame and TDCM. To accommodate time-dependent effects, the data use the counting process structure introduced by Andersen and Gill (1982). In this structure, apart from the time-fixed covariate (named covariate), an individual's survival data are expressed by three variables: start, stop and event. Individuals without a change in the time-dependent covariate (named tdcov) are represented by a single line of data, whereas individuals with a change in the time-dependent covariate are represented by two lines. For the latter, the first line covers the period up to the change in the time-dependent covariate, and the second line covers the period from that change to the end of follow-up. On each line, start and stop delimit the time interval (start, stop] covered by that line, and event is an indicator taking value 1 if a death occurred at time stop and 0 otherwise. More details about this data structure can be found in Meira-Machado et al. (2009).
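The layout described above can be pictured with a small hand-built data frame. This is only a sketch with made-up values, not output from genTDCM. The id column and the 0/1 coding of tdcov are assumptions added for illustration; the names start, stop, event, covariate and tdcov come from the description above.

```r
# Toy illustration of the counting-process layout (values are invented).
# Assumptions: an 'id' column identifies subjects, and tdcov is coded
# 0 before the change and 1 after it.
toy <- data.frame(
  id        = c(1, 2, 2),
  start     = c(0.0, 0.0, 0.8),
  stop      = c(1.5, 0.8, 2.1),
  event     = c(1, 0, 1),
  covariate = c(0.4, 1.2, 1.2),   # time-fixed: identical on both rows of id 2
  tdcov     = c(0, 0, 1)          # changes at time 0.8 for id 2
)
toy
```

Subject 1 has no change in tdcov and occupies one row. Subject 2 changes at time 0.8 and occupies two rows, with the first row ending where the second starts and the event recorded only on the last row.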

Author(s)

Artur Araújo, Luís Meira Machado and Susana Faria

References

Andersen, P.K., Gill, R.D. (1982). Cox's regression model for counting processes: a large sample study. Annals of Statistics, 10(4), 1100-1120. doi: 10.1214/aos/1176345976

Cox, D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society: Series B, 34(2), 187-202. doi: 10.1111/j.2517-6161.1972.tb00899.x

Johnson, M. E. (1987). Multivariate Statistical Simulation, John Wiley and Sons.

Johnson, N., Kotz, S. (1972). Distributions in statistics: continuous multivariate distributions, John Wiley and Sons.

Lu, J., Bhattacharyya, G. (1990). Some new constructions of bivariate Weibull models. Annals of the Institute of Statistical Mathematics, 42(3), 543-559. doi: 10.1007/BF00049307

Meira-Machado, L., Cadarso-Suárez, C., De Uña-Álvarez, J., Andersen, P.K. (2009). Multi-state models for the analysis of time to event data. Statistical Methods in Medical Research, 18(2), 195-222. doi: 10.1177/0962280208092301

Meira-Machado, L., Faria, S. (2014). A simulation study comparing modeling approaches in an illness-death multi-state model. Communications in Statistics - Simulation and Computation, 43(5), 929-946. doi: 10.1080/03610918.2012.718841

Meira-Machado, L., Sestelo, M. (2019). Estimation in the progressive illness-death model: a nonexhaustive review. Biometrical Journal, 61(2), 245-263. doi: 10.1002/bimj.201700200

Therneau, T.M., Grambsch, P.M. (2000). Modeling survival data: extending the Cox model, New York: Springer.

See Also

genCMM, genCPHM, genTHMM.

Examples

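library(genSurv)
library(survival)

# Bivariate Weibull covariates (corr = 0.8) with uniform censoring
tdcmdata <- genTDCM(n = 1000, dist = "weibull", corr = 0.8,
                    dist.par = c(2, 3, 2, 3), model.cens = "uniform",
                    cens.par = 2.5, beta = c(-3.3, 4), lambda = 1)
head(tdcmdata, n = 20L)
# Fit a Cox model to the counting-process (start, stop, event) data
fit1 <- coxph(Surv(start, stop, event) ~ tdcov + covariate, data = tdcmdata)
summary(fit1)

# Bivariate exponential covariates, independent (corr = 0), uniform censoring
tdcmdata2 <- genTDCM(n = 1000, dist = "exponential", corr = 0,
                     dist.par = c(1, 1), model.cens = "uniform",
                     cens.par = 1, beta = c(-3, 2), lambda = 0.5)
head(tdcmdata2, n = 20L)
fit2 <- coxph(Surv(start, stop, event) ~ tdcov + covariate, data = tdcmdata2)
summary(fit2)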

[Package genSurv version 1.0.4 Index]