R: Simulate data under a mixture cure model

generate_cure_data {hdcuremodels}

R Documentation

Simulate data under a mixture cure model

Description

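Simulate data under a mixture cure model, returning training and testing data sets together with the true parameter values used in the simulation.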

Usage

generate_cure_data(
  N = 400,
  J = 500,
  nonp = 2,
  train.prop = 3/4,
  nTrue = 10,
  A = 1,
  rho = 0.5,
  itct_mean = 0.5,
  cens_ub = 20,
  alpha = 1,
  lambda = 2,
  same_signs = FALSE,
  model = "weibull"
)

Arguments

`N`	an integer denoting the total sample size.
`J`	an integer denoting the number of penalized predictors, which is the same for both the incidence and latency portions of the model.
`nonp`	an integer less than `J` denoting the number of unpenalized predictors, which is the same for both the incidence and latency portions of the model.
`train.prop`	a numeric value between 0 and 1 representing the fraction of `N` to be used in forming the Training dataset.
`nTrue`	an integer denoting the number of variables truly associated with the outcome (i.e., the number of covariates with nonzero parameter values) among the penalized predictors.
`A`	a numeric value denoting the effect size, which is the same for both the incidence and latency portions of the model.
`rho`	a numeric value between 0 and 1 representing the correlation between adjacent covariates in the same block. See details below.
`itct_mean`	a numeric value representing the expectation of the incidence intercept, which controls the cure rate.
`cens_ub`	a numeric value representing the upper bound of the censoring time distribution, which is uniform between 0 and `cens_ub`.
`alpha`	a numeric value representing the shape parameter in the Weibull density.
`lambda`	a numeric value representing the rate parameter in the Weibull density.
`same_signs`	logical; if `TRUE`, the incidence and latency coefficients have the same signs.
`model`	type of regression model to use for the latency portion of the mixture cure model. Can be "weibull", "GG", "Gompertz", "nonparametric", or "GG_baseline".
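
To make the roles of `itct_mean`, `alpha`, `lambda`, and `cens_ub` more concrete, here is a minimal base R sketch of a covariate-free Weibull mixture cure simulation. It is not the code used by `generate_cure_data()`. The Weibull parameterization S(t) = exp(-lambda * t^alpha), the logistic link for the probability of being susceptible, and the object names `sketch`, `susceptible`, `event_time`, and `cens_time` are all assumptions made for illustration.

```r
# Illustrative only: NOT the implementation behind generate_cure_data().
set.seed(1)

n       <- 400  # sample size (plays the role of N)
itct    <- 0.5  # incidence intercept (plays the role of itct_mean)
alpha   <- 1    # Weibull shape
lambda  <- 2    # Weibull rate
cens_ub <- 20   # upper bound of the uniform censoring distribution

# Incidence (assumed logistic link): with no covariates,
# P(susceptible) = plogis(itct), so the cure rate is 1 - plogis(itct).
susceptible <- rbinom(n, size = 1, prob = plogis(itct))

# Latency: event times by inverse transform sampling, assuming
# the survival function S(t) = exp(-lambda * t^alpha).
event_time <- (-log(runif(n)) / lambda)^(1 / alpha)
event_time[susceptible == 0] <- Inf  # cured subjects never have the event

# Censoring: uniform between 0 and cens_ub.
cens_time <- runif(n, min = 0, max = cens_ub)

sketch <- data.frame(
  Time   = pmin(event_time, cens_time),
  Censor = as.integer(event_time <= cens_time)  # 1 = event, 0 = censored
)

mean(sketch$Censor == 0)  # observed proportion of censored subjects
```

In this sketch every cured subject is censored, so a larger `cens_ub` mostly reduces censoring among susceptible subjects, while the intercept determines how many subjects can never experience the event.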

Value

`Training`	Training data.frame which includes Time, Censor, and covariates.
`Testing`	Testing data.frame which includes Time, Censor, and covariates.
`parameters`	A list including: the indices of the true incidence signals (`nonzero_b`), the indices of the true latency signals (`nonzero_beta`), the unpenalized incidence parameter values (`b_u`), the unpenalized latency parameter values (`beta_u`), the parameter values for the true incidence signals among the penalized covariates (`b_p_nz`), the parameter values for the true latency signals among the penalized covariates (`beta_p_nz`), and the parameter value for the incidence intercept (`itct`).
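
As a rough illustration of working with the returned list, the sketch below simulates a data set and inspects each component. It assumes the hdcuremodels package is installed and uses only the component names listed above; the object name `sim` is arbitrary.

```r
library(hdcuremodels)

set.seed(1234)
sim <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)

names(sim)                # components of the returned list
dim(sim$Training)         # rows and columns of the training data
dim(sim$Testing)          # rows and columns of the testing data
str(sim$parameters)       # true parameter values used in the simulation
sim$parameters$nonzero_b  # indices of the true incidence signals
```

Keeping `sim$parameters` alongside a fitted model makes it possible to compare the selected variables against the true signals.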

Examples

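library(survival)      # provides Surv()
library(hdcuremodels)  # provides generate_cure_data() and cureem()
set.seed(1234)         # make the simulated data reproducible
# Simulate 200 subjects with 50 penalized predictors, 10 of them true signals
sim <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)
training <- sim$Training
testing <- sim$Testing
# Fit a lasso-penalized Cox mixture cure model to the training data
fit <- cureem(Surv(Time, Censor) ~ ., data = training,
              x.latency = training, model = "cox", penalty = "lasso",
              lambda.inc = 0.05, lambda.lat = 0.05,
              gamma.inc = 6, gamma.lat = 10)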
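
Before fitting a model, it can help to check that the simulated data look like cure data. The following sketch, which assumes the survival and hdcuremodels packages are installed, computes the censoring proportion and draws a Kaplan-Meier curve for the training set. It treats `Censor` as the event indicator (1 = event observed), consistent with its use as the status argument of `Surv()` above; the object names `sim` and `km` are arbitrary.

```r
library(survival)
library(hdcuremodels)

set.seed(1234)
sim <- generate_cure_data(N = 200, J = 50, nTrue = 10, A = 1.8, rho = 0.2)
training <- sim$Training

# Proportion of censored observations, assuming Censor == 0 means censored
mean(training$Censor == 0)

# Kaplan-Meier estimate of the overall survival function
km <- survfit(Surv(Time, Censor) ~ 1, data = training)
plot(km, xlab = "Time", ylab = "Survival probability")
```

A curve that levels off well above zero at late follow-up times is what one would expect to see when a cured fraction is present.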

[Package hdcuremodels version 0.0.1 Index]