pencox {pencal}R Documentation

Estimation of a penalized Cox model with time-independent covariates

Description

This function estimates a penalized Cox model where only time-independent covariates are included as predictors, and then computes a bootstrap optimism correction procedure that is used to validate the predictive performance of the model

Usage

pencox(data, formula, penalty = "ridge", standardize = TRUE,
  penalty.factor = 1, n.alpha.elnet = 11, n.folds.elnet = 5,
  n.boots = 0, n.cores = 1, verbose = TRUE)

Arguments

data

a data frame with one row for each subject.It should at least contain a subject id (called id), the time to event outcome (time), and the binary censoring indicator (event), plus at least one covariate to be included in the linear predictor

formula

a formula specifying the variables in data to include as predictors in the penalized Cox model

penalty

the type of penalty function used for regularization. Default is 'ridge', other possible values are 'elasticnet' and 'lasso'

standardize

logical argument: should the covariates be standardized when included in the penalized Cox model? Default is TRUE

penalty.factor

a single value, or a vector of values, indicating whether the covariates (if any) should be penalized (1) or not (0). Default is penalty.factor = 1

n.alpha.elnet

number of alpha values for the two-dimensional grid of tuning parameteres in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 11, so that the resulting alpha grid is c(1, 0.9, 0.8, ..., 0.1, 0)

n.folds.elnet

number of folds to be used for the selection of the tuning parameter in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 5

n.boots

number of bootstrap samples to be used in the bootstrap optimism correction procedure. If 0, no bootstrapping is performed

n.cores

number of cores to use to parallelize the computation of the CBOCP. If ncores = 1 (default), no parallelization is done. Pro tip: you can use parallel::detectCores() to check how many cores are available on your computer

verbose

if TRUE (default and recommended value), information on the ongoing computations is printed in the console

Value

A list containing the following objects:

Author(s)

Mirko Signorelli

References

Signorelli, M. (2024). pencal: an R Package for the Dynamic Prediction of Survival with Many Longitudinal Predictors. To appear in: The R Journal. Preprint: arXiv:2309.15600

Signorelli, M., Spitali, P., Al-Khalili Szigyarto, C, The MARK-MD Consortium, Tsonaka, R. (2021). Penalized regression calibration: a method for the prediction of survival outcomes using complex longitudinal and high-dimensional data. Statistics in Medicine, 40 (27), 6178-6196. DOI: 10.1002/sim.9178

See Also

fit_prclmm, fit_prcmlpmm

Examples

# generate example data
set.seed(1234)
p = 4 # number of longitudinal predictors
simdata = simulate_prclmm_data(n = 100, p = p, p.relev = 2, 
             seed = 123, t.values = c(0, 0.2, 0.5, 1, 1.5, 2))
#create dataframe with baseline measurements only
baseline.visits = simdata$long.data[which(!duplicated(simdata$long.data$id)),]
df = merge(simdata$surv.data, baseline.visits, by = 'id')
df = df[ , -c(5:6)]

do.bootstrap = FALSE
# IMPORTANT: set do.bootstrap = TRUE to compute the optimism correction!
n.boots = ifelse(do.bootstrap, 100, 0)
more.cores = FALSE
# IMPORTANT: set more.cores = TRUE to speed computations up!
if (!more.cores) n.cores = 2
if (more.cores) {
   # identify number of available cores on your machine
   n.cores = parallel::detectCores()
   if (is.na(n.cores)) n.cores = 2
}

form = as.formula(~ baseline.age + marker1 + marker2
                     + marker3 + marker4)
base.pcox = pencox(data = df, 
              formula = form, 
              n.boots = n.boots, n.cores = n.cores) 
ls(base.pcox)

[Package pencal version 2.2.2 Index]