generate_cure_data {hdcuremodels}    R Documentation

Simulate data under a mixture cure model

Description

Simulates training and testing datasets under a mixture cure model.

Usage

generate_cure_data(
  N = 400,
  J = 500,
  nonp = 2,
  train.prop = 3/4,
  nTrue = 10,
  A = 1,
  rho = 0.5,
  itct_mean = 0.5,
  cens_ub = 20,
  alpha = 1,
  lambda = 2,
  same_signs = FALSE,
  model = "weibull"
)

Arguments

N

an integer denoting the total sample size.

J

an integer denoting the number of penalized predictors, which is the same for both the incidence and latency portions of the model.

nonp

an integer less than J denoting the number of unpenalized predictors, which is the same for both the incidence and latency portions of the model.

train.prop

a numeric value between 0 and 1 representing the fraction of N to be used in forming the Training dataset.

nTrue

an integer denoting the number of variables truly associated with the outcome (i.e., the number of covariates with nonzero parameter values) among the penalized predictors.

A

a numeric value denoting the effect size, which is the same for both the incidence and latency portions of the model.

rho

a numeric value between 0 and 1 representing the correlation between adjacent covariates in the same block. See details below.

itct_mean

a numeric value representing the expectation of the incidence intercept, which controls the cure rate.

cens_ub

a numeric value representing the upper bound of the censoring time distribution, which is uniform on the interval from 0 to cens_ub.

alpha

a numeric value representing the shape parameter in the Weibull density.

lambda

a numeric value representing the rate parameter in the Weibull density.

same_signs

logical; if TRUE, the incidence and latency coefficients have the same signs.

model

type of regression model to use for the latency portion of the mixture cure model. Can be "weibull", "GG", "Gompertz", "nonparametric", or "GG_baseline".
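
The following base R sketch illustrates how the arguments above could fit together in a Weibull mixture cure simulation. It is not the code used by generate_cure_data. It assumes an AR(1)-type correlation between covariates, a logistic incidence model, a Weibull survival function of the form exp(-(lambda * t)^alpha), and that Censor equals 1 when the event is observed. The coefficient vector b and the object name sim are hypothetical.

```r
# Illustrative sketch only: NOT the hdcuremodels implementation.
set.seed(1)

N <- 400        # total sample size
J <- 6          # a small number of covariates, for readability
rho <- 0.5      # correlation between adjacent covariates
itct <- 0.5     # incidence intercept (held fixed in this sketch)
alpha <- 1      # Weibull shape
lambda <- 2     # Weibull rate (assumed: S(t) = exp(-(lambda * t)^alpha))
cens_ub <- 20   # upper bound of the uniform censoring distribution

# Assumption: AR(1)-type structure, cor(X_j, X_k) = rho^|j - k|.
Sigma <- rho^abs(outer(seq_len(J), seq_len(J), "-"))
X <- matrix(rnorm(N * J), nrow = N) %*% chol(Sigma)

# Assumption: logistic incidence model with one hypothetical true signal.
b <- c(1, rep(0, J - 1))
p_uncured <- plogis(itct + drop(X %*% b))
uncured <- rbinom(N, size = 1, prob = p_uncured)

# Latency: Weibull event times for uncured subjects; cured subjects
# never experience the event, so their event time is infinite.
event_time <- ifelse(uncured == 1,
                     rweibull(N, shape = alpha, scale = 1 / lambda),
                     Inf)

# Censoring times are uniform on the interval from 0 to cens_ub.
cens_time <- runif(N, min = 0, max = cens_ub)

# Observed time and event indicator (assumed: 1 = event, 0 = censored).
Time <- pmin(event_time, cens_time)
Censor <- as.integer(event_time <= cens_time)

sim <- data.frame(Time = Time, Censor = Censor, X)
```

In this sketch a larger intercept raises the probability of being uncured and so lowers the cure rate; with all covariates at zero, the cure probability is 1 - plogis(itct).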

Value

Training

Training data.frame which includes Time, Censor, and covariates.

Testing

Testing data.frame which includes Time, Censor, and covariates.

parameters

A list including: the indices of true incidence signals (nonzero_b), the indices of true latency signals (nonzero_beta), the unpenalized incidence parameter values (b_u), the unpenalized latency parameter values (beta_u), the parameter values for the true incidence signals among penalized covariates (b_p_nz), the parameter values for the true latency signals among penalized covariates (beta_p_nz), and the parameter value for the incidence intercept (itct).
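
As a rough illustration of the Training and Testing elements, the base R sketch below splits a simulated data frame according to train.prop. The random assignment of rows and the use of floor() for rounding are assumptions, not documented behavior of generate_cure_data, and the names sim, n_train, train_idx, and out are hypothetical.

```r
# Illustrative sketch only: NOT the hdcuremodels implementation.
set.seed(1)

N <- 400
train.prop <- 3/4

# Placeholder data with the documented column layout: Time, Censor, covariates.
sim <- data.frame(Time = rexp(N),
                  Censor = rbinom(N, size = 1, prob = 0.5),
                  X1 = rnorm(N))

# Assumption: the training size is floor(train.prop * N), sampled at random.
n_train <- floor(train.prop * N)
train_idx <- sample(seq_len(N), size = n_train)

out <- list(Training = sim[train_idx, ],
            Testing = sim[-train_idx, ])
```

With the default N = 400 and train.prop = 3/4, this split gives 300 training rows and 100 testing rows.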

Examples

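library(survival)      # provides Surv()
library(hdcuremodels)  # provides generate_cure_data() and cureem()
set.seed(1234)
# Simulate 200 subjects with 50 penalized covariates, 10 of them true signals
data <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)
training <- data$Training
testing <- data$Testing
# Fit a lasso-penalized mixture cure model with a Cox latency component
fit <- cureem(Surv(Time, Censor) ~ ., data = training,
              x.latency = training, model = "cox", penalty = "lasso",
              lambda.inc = 0.05, lambda.lat = 0.05,
              gamma.inc = 6, gamma.lat = 10)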

[Package hdcuremodels version 0.0.1 Index]