simu {rocTree}    R Documentation

Function to generate simulated data used in the manuscript.

Description

This function generates simulated data under various settings. Let Z be a p-dimensional vector of possibly time-dependent covariates and \beta be the vector of regression coefficients. The survival times (T) are generated from the hazard function specified as follows:

Scenario 1.1

Proportional hazards model:

\lambda(t|Z) = \lambda_0(t) e^{-0.5 Z_1 + 0.5 Z_2 - 0.5 Z_3 ... + 0.5 Z_{10}},

where \lambda_0(t) = 2t.
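One way to draw survival times from this hazard is inverse-transform sampling. With \lambda_0(t) = 2t the cumulative hazard is \Lambda(t|Z) = t^2 e^{\eta}, where \eta is the linear predictor, so T = \sqrt{E / e^{\eta}} for E following Exp(1). The sketch below is an illustration only, not the package's own code: the helper name sim_ph and the Uniform(0, 1) covariate distribution are assumptions, since this page does not state how Z is drawn.

```r
# Hypothetical helper (not part of rocTree): inverse-transform sampling
# under the Scenario 1.1 hazard with baseline hazard 2t.
sim_ph <- function(n, beta = rep(c(-0.5, 0.5), 5)) {
  p <- length(beta)
  # Assumption: covariates are independent Uniform(0, 1); the page does not say.
  Z <- matrix(runif(n * p), nrow = n, ncol = p)
  colnames(Z) <- paste0("z", seq_len(p))
  eta <- drop(Z %*% beta)
  # Cumulative hazard is t^2 * exp(eta); solve t^2 * exp(eta) = E with E ~ Exp(1).
  E <- rexp(n)
  data.frame(time = sqrt(E / exp(eta)), Z)
}

set.seed(1)
head(sim_ph(5))
```

The same helper covers Scenario 1.2 by passing beta = c(2, 2, rep(0, 8)), because only the coefficients differ.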

Scenario 1.2

Proportional hazards model with noise variable:

\lambda(t|Z) = \lambda_0(t) e^{2Z_1 + 2Z_2 + 0Z_3 + ... + 0Z_{10}},

where \lambda_0(t) = 2t.

Scenario 1.3

Proportional hazards model with nonlinear covariate effects:

\lambda(t|Z) = \lambda_0(t) e^{[2\sin(2\pi Z_1) + 2|Z_2 - 0.5|]},

where \lambda_0(t) = 2t.

Scenario 1.4

Accelerated failure time model:

\log(T) = -2 + 2Z_1 + 2Z_2 + \epsilon,

where \epsilon follows N(0, 0.5^2).
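Because the model is stated directly for \log(T), survival times can be drawn without inverting a hazard. The following is a minimal sketch under assumptions: sim_aft is a hypothetical helper, and Z_1 and Z_2 are taken to be independent Uniform(0, 1), which this page does not specify. Note that N(0, 0.5^2) means a standard deviation of 0.5.

```r
# Hypothetical helper (not part of rocTree): draw survival times from the
# Scenario 1.4 accelerated failure time model.
sim_aft <- function(n) {
  # Assumption: Z_1 and Z_2 are independent Uniform(0, 1).
  z1 <- runif(n)
  z2 <- runif(n)
  # N(0, 0.5^2) has standard deviation 0.5, which is what rnorm() expects.
  eps <- rnorm(n, mean = 0, sd = 0.5)
  data.frame(time = exp(-2 + 2 * z1 + 2 * z2 + eps), z1 = z1, z2 = z2)
}

set.seed(1)
head(sim_aft(5))
```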

Scenario 1.5

Generalized gamma family:

T = e^{\sigma\omega},

where \omega = \log(Q^2 g) / Q, g follows Gamma(Q^{-2}, 1), \sigma = 2Z_1, Q = 2Z_2.
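The construction above translates almost line by line into R. This is a sketch rather than the package's implementation: sim_gg is a hypothetical helper, Gamma(Q^{-2}, 1) is read as shape Q^{-2} and rate 1, and Z_1 and Z_2 are assumed to be independent Uniform(0, 1) so that \sigma and Q are positive.

```r
# Hypothetical helper (not part of rocTree): draw survival times from the
# Scenario 1.5 generalized gamma construction.
sim_gg <- function(n) {
  # Assumption: Z_1 and Z_2 are independent Uniform(0, 1).
  z1 <- runif(n)
  z2 <- runif(n)
  sigma <- 2 * z1
  Q <- 2 * z2
  # Assumption: Gamma(Q^-2, 1) means shape = Q^-2 and rate = 1.
  g <- rgamma(n, shape = Q^(-2), rate = 1)
  omega <- log(Q^2 * g) / Q
  data.frame(time = exp(sigma * omega), z1 = z1, z2 = z2)
}

set.seed(1)
head(sim_gg(5))
```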

Scenario 2.1

Dichotomous time-dependent covariate with at most one change in value:

\lambda(t|Z(t)) = \lambda_0(t)e^{2Z_1(t) + 2Z_2},

where Z_1(t) is the time-dependent covariate: Z_1(t) = \theta I(t \ge U_0) + (1 - \theta) I(t < U_0), \theta is a Bernoulli variable with equal probability, and U_0 follows a uniform distribution over [0, 1].
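The covariate path can be written as a small vectorized function of t. The name z1_single is hypothetical and only illustrates the definition of Z_1(t) given here.

```r
# Hypothetical helper: Z_1(t) for Scenario 2.1, vectorized over t.
# theta is 0 or 1; U0 is the single change point.
z1_single <- function(t, theta, U0) {
  theta * (t >= U0) + (1 - theta) * (t < U0)
}

# With theta = 1 the covariate switches from 0 to 1 at U0.
z1_single(c(0.2, 0.6), theta = 1, U0 = 0.5)  # 0 1
# With theta = 0 it switches from 1 to 0 instead.
z1_single(c(0.2, 0.6), theta = 0, U0 = 0.5)  # 1 0

# Drawing the latent variables as described above.
set.seed(1)
theta <- rbinom(1, size = 1, prob = 0.5)
U0 <- runif(1)
z1_single(seq(0, 1, by = 0.25), theta, U0)
```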

Scenario 2.2

Dichotomous time-dependent covariate with multiple changes:

\lambda(t|Z(t)) = e^{2Z_1(t) + 2Z_2},

where Z_1(t) = \theta[I(U_1\le t < U_2) + I(U_3 \le t)] + (1 - \theta)[I(t < U_1) + I(U_2\le t < U_3)], \theta is a Bernoulli variable with equal probability, and U_1\le U_2\le U_3 are the first three terms of a stationary Poisson process with rate 10.
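The first three event times of a stationary Poisson process with rate 10 can be obtained as cumulative sums of independent Exp(10) gaps. The sketch below uses that fact together with the definition of Z_1(t); z1_multi is a hypothetical helper name, not a function from the package.

```r
# Hypothetical helper: Z_1(t) for Scenario 2.2, vectorized over t.
# U = c(U1, U2, U3) holds the ordered change points.
z1_multi <- function(t, theta, U) {
  # 1 on [U1, U2) and [U3, Inf), 0 on [0, U1) and [U2, U3).
  in_theta <- as.numeric((U[1] <= t & t < U[2]) | (U[3] <= t))
  theta * in_theta + (1 - theta) * (1 - in_theta)
}

# Fixed change points make the pattern easy to check by eye.
z1_multi(c(0.05, 0.15, 0.25, 0.35), theta = 1, U = c(0.1, 0.2, 0.3))  # 0 1 0 1

# Random change points: cumulative sums of Exp(10) inter-arrival times.
set.seed(1)
U <- cumsum(rexp(3, rate = 10))
theta <- rbinom(1, size = 1, prob = 0.5)
z1_multi(seq(0, 0.5, by = 0.1), theta, U)
```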

Scenario 2.3

Proportional hazards model with a continuous time-dependent covariate:

\lambda(t|Z(t)) = 0.1 e^{Z_1(t) + Z_2},

where Z_1(t) = kt + b, k and b are independent uniform random variables over [1, 2].
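For this scenario the cumulative hazard has a closed form, \Lambda(t) = 0.1 e^{b + Z_2} (e^{kt} - 1) / k, so setting \Lambda(T) = E with E following Exp(1) gives T = \log\{1 + kE / (0.1 e^{b + Z_2})\} / k. The following sketch applies that inversion; inv_cumhaz_23 and sim_tdc are hypothetical names, and the Uniform(0, 1) distribution for Z_2 is an assumption not stated on this page.

```r
# Hypothetical helper: invert the Scenario 2.3 cumulative hazard
# 0.1 * exp(b + z2) * (exp(k * t) - 1) / k at the value E.
inv_cumhaz_23 <- function(E, k, b, z2) {
  log1p(k * E / (0.1 * exp(b + z2))) / k
}

# Hypothetical helper: draw survival times together with the latent k and b.
sim_tdc <- function(n) {
  k <- runif(n, 1, 2)
  b <- runif(n, 1, 2)
  z2 <- runif(n)  # Assumption: Z_2 is Uniform(0, 1).
  E <- rexp(n)    # Exp(1) draws; T solves Lambda(T) = E.
  data.frame(time = inv_cumhaz_23(E, k, b, z2), k = k, b = b, z2 = z2)
}

set.seed(1)
head(sim_tdc(5))
```

Scenarios 2.4 and 2.5 have no comparably simple closed form, so a numerical root-finder applied to the integrated hazard would be one alternative there.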

Scenario 2.4

Non-proportional hazards model with a continuous time-dependent covariate:

\lambda(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],

where Z_1(t) = kt + b, k and b follow independent uniform distributions over [1, 2].

Scenario 2.5

Non-proportional hazards model with a nonlinear time-dependent covariate:

\lambda(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],

where Z_1(t) = 2kt\cdot \{I(t > 5) - 1\} + b, k and b follow independent uniform distributions over [1, 2].

The censoring times are generated from an independent uniform distribution over [0, c], where c was tuned to yield censoring percentages of 25% and 50%.
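This page does not say how c was tuned. One possible approach, sketched below purely as an assumption, uses the fact that for a censoring time C uniform over [0, c] the probability of censoring a subject with survival time T is min(T, c) / c, and solves for the c at which the average of that quantity matches the target. The helper names tune_c and apply_censoring are hypothetical.

```r
# Hypothetical helper: find c so that the expected censoring proportion
# under C ~ Uniform(0, c) matches `target` for the given survival times.
tune_c <- function(time, target, upper = 1e3) {
  # P(C < T_i) = min(T_i, c) / c; average over subjects and match the target.
  f <- function(c_max) mean(pmin(time, c_max)) / c_max - target
  uniroot(f, lower = 1e-8, upper = upper)$root
}

# Hypothetical helper: apply uniform censoring and build Y and death.
apply_censoring <- function(time, c_max) {
  C <- runif(length(time), 0, c_max)
  data.frame(Y = pmin(time, C), death = as.integer(time <= C))
}

set.seed(1)
time <- rexp(2000)              # placeholder survival times for illustration
c25 <- tune_c(time, target = 0.25)
dat <- apply_censoring(time, c25)
mean(dat$death == 0)            # empirical censoring proportion
```

The search interval passed to uniroot has to bracket the solution, so upper may need to be raised when survival times are large.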

Usage

simu(n, cen, scenario, summary = FALSE)

trueHaz(dat)

trueSurv(dat)

Arguments

n

an integer value indicating the number of subjects.

cen

a numeric value indicating the censoring percentage; three levels, 0%, 25%, and 50%, are allowed. The examples pass the level as a proportion, e.g. 0.25.

scenario

either a numeric value or a character string indicating the simulation scenario described above.

summary

a logical value indicating whether a brief data summary will be printed.

dat

a data.frame prepared by simu.

Value

simu returns a data.frame with the following columns:

id

is the subject id.

Y

is the observed follow-up time.

death

is the death indicator; death = 0 if censored.

z1–z10

are the possibly time-independent covariates.

k, b, U

are the latent variables used to generate Z_1(t) in Scenarios 2.1–2.5.

The returned data.frame can be supplied to trueHaz and trueSurv to generate the true cumulative hazard function and the true survival function, respectively.

Examples

set.seed(1)
simu(10, 0.25, 1.2, TRUE)

set.seed(1)
simu(10, 0.50, 2.2, TRUE)


[Package rocTree version 1.1.1 Index]