generate_cure_data {hdcuremodels} | R Documentation |
Simulate data under a mixture cure model
Description
Simulate data under a mixture cure model
Usage
generate_cure_data(
  N = 400,
  J = 500,
  nonp = 2,
  train.prop = 3/4,
  nTrue = 10,
  A = 1,
  rho = 0.5,
  itct_mean = 0.5,
  cens_ub = 20,
  alpha = 1,
  lambda = 2,
  same_signs = FALSE,
  model = "weibull"
)
Arguments
N |
an integer denoting the total sample size. |
J |
an integer denoting the number of penalized predictors, which is the same for both the incidence and latency portions of the model. |
nonp |
an integer less than J denoting the number of unpenalized predictors, which is the same for both the incidence and latency portions of the model. |
train.prop |
a numeric value between 0 and 1 representing the fraction of N to be used in forming the Training dataset. |
nTrue |
an integer denoting the number of variables truly associated with the outcome (i.e., the number of covariates with nonzero parameter values) among the penalized predictors. |
A |
a numeric value denoting the effect size, which is the same for both the incidence and latency portions of the model. |
rho |
a numeric value between 0 and 1 representing the correlation between adjacent covariates in the same block. See details below. |
itct_mean |
a numeric value representing the expectation of the incidence intercept, which controls the cure rate. |
cens_ub |
a numeric value representing the upper bound on the censoring time distribution, which follows a uniform distribution on [0, cens_ub]. |
alpha |
a numeric value representing the shape parameter in the Weibull density. |
lambda |
a numeric value representing the rate parameter in the Weibull density. |
same_signs |
logical; if TRUE, the incidence and latency coefficients have the same signs. |
model |
type of regression model to use for the latency portion of the mixture cure model. Can be "weibull", "GG", "Gompertz", "nonparametric", or "GG_baseline". |
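To make the roles of itct_mean, alpha, lambda, and cens_ub concrete, here is a minimal base R sketch of a covariate-free mixture cure simulation. It is not the hdcuremodels implementation. It assumes that the incidence part is a logistic model for being susceptible (uncured), that the Weibull survival function is S(t) = exp(-(lambda * t)^alpha), and that Censor equals 1 when the event is observed. The names p_uncured, susceptible, and sim are hypothetical.

```r
# Illustrative sketch only; not how generate_cure_data() is implemented.
set.seed(1)
n       <- 1000
itct    <- 0.5   # incidence intercept (compare itct_mean)
alpha   <- 1     # Weibull shape
lambda  <- 2     # Weibull rate; ASSUMED S(t) = exp(-(lambda * t)^alpha)
cens_ub <- 20    # upper bound of the uniform censoring distribution

# ASSUMPTION: logistic incidence model for the probability of being susceptible
p_uncured   <- plogis(itct)
susceptible <- rbinom(n, size = 1, prob = p_uncured)

# Susceptible subjects get a Weibull event time; cured subjects never fail
event_time <- ifelse(susceptible == 1,
                     rweibull(n, shape = alpha, scale = 1 / lambda),
                     Inf)

# Censoring times are uniform on [0, cens_ub]
cens_time <- runif(n, min = 0, max = cens_ub)

sim <- data.frame(
  Time   = pmin(event_time, cens_time),          # observed follow-up time
  Censor = as.integer(event_time <= cens_time)   # ASSUMED 1 = event observed
)

mean(sim$Censor)  # observed event fraction, at most the susceptible fraction
```

In this sketch a larger intercept raises the susceptible fraction and so lowers the cure rate, while a smaller cens_ub censors more susceptible subjects before their event occurs.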
Value
Training |
Training data.frame which includes Time, Censor, and covariates. |
Testing |
Testing data.frame which includes Time, Censor, and covariates. |
parameters |
A list including: the indices of true incidence signals ( |
Examples
library(survival)
set.seed(1234)
data <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)
training <- data$Training
testing <- data$Testing
fit <- cureem(Surv(Time, Censor) ~ ., data = training,
              x.latency = training, model = "cox", penalty = "lasso",
              lambda.inc = 0.05, lambda.lat = 0.05,
              gamma.inc = 6, gamma.lat = 10)
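Before fitting, it can be useful to check that the simulated training data look like cure-model data. The following sketch is a suggestion rather than part of the package's documented workflow, and it assumes the hdcuremodels and survival packages are installed. It tabulates the Censor column and draws a Kaplan-Meier curve. The name km is hypothetical.

```r
library(survival)
library(hdcuremodels)

set.seed(1234)
data <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)
training <- data$Training

# Counts of the status variable passed to Surv()
table(training$Censor)

# Kaplan-Meier estimate ignoring covariates; a long plateau above zero
# is consistent with a cured fraction in the simulated population
km <- survfit(Surv(Time, Censor) ~ 1, data = training)
plot(km, xlab = "Time", ylab = "Survival probability")

# Lowest value reached by the estimated survival curve
min(km$surv)
```

A curve that levels off well above zero suggests a nonzero cure rate, whereas a curve that drops toward zero suggests that few subjects are cured under the chosen settings.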