R: Step 3 of PRC-LMM (estimation of the penalized Cox model(s))

fit_prclmm {pencal}

R Documentation

Step 3 of PRC-LMM (estimation of the penalized Cox model(s))

Description

This function performs the third step of the estimation of the PRC-LMM model proposed in Signorelli et al. (2021).
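
To give an idea of what a penalized Cox model is, here is a minimal sketch of a ridge-penalized Cox regression on simulated data, fitted with `coxph()` and `ridge()` from the `survival` package. It illustrates the model class only and is not necessarily how `fit_prclmm` estimates the model. The simulated predictors `x1` and `x2`, the sample size and the penalty value `theta = 1` are arbitrary assumptions.

```r
library(survival)

# simulated data (hypothetical): two predictors, one of which affects the hazard
set.seed(1)
n <- 100
d <- data.frame(id = 1:n, x1 = rnorm(n), x2 = rnorm(n))
d$time <- rexp(n, rate = exp(0.5 * d$x1))  # event times depend on x1 only
d$event <- rbinom(n, 1, 0.8)               # 1 = event observed, 0 = censored

# Cox model with a ridge penalty on both predictors;
# theta = 1 is an arbitrary penalty value, not a tuned one
fit <- coxph(Surv(time, event) ~ ridge(x1, x2, theta = 1), data = d)
print(coef(fit))
```

In practice the amount of penalization is not fixed in advance but chosen by cross-validation, which is the role of the `cv.seed` and `n.folds.elnet` arguments described below.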

Usage

fit_prclmm(object, surv.data, baseline.covs = NULL, penalty = "ridge",
  standardize = TRUE, pfac.base.covs = 0, cv.seed = 19920207,
  n.alpha.elnet = 11, n.folds.elnet = 5, n.cores = 1, verbose = TRUE)

Arguments

`object`	the output of step 2 of the PRC-LMM procedure, as produced by the `summarize_lmms` function
`surv.data`	a data frame with the survival data and (if relevant) additional baseline covariates. `surv.data` should contain at least a subject id (called `id`), the time-to-event outcome (`time`), and a binary event indicator (`event`)
`baseline.covs`	a formula specifying the variables (e.g., baseline age) in `surv.data` that should be included as baseline covariates in the penalized Cox model. Example: `baseline.covs = '~ baseline.age'`. Default is `NULL`
`penalty`	the type of penalty function used for regularization. Default is `'ridge'`, other possible values are `'elasticnet'` and `'lasso'`
`standardize`	logical argument: should the predictors (both baseline covariates and predicted random effects) be standardized when included as covariates in the penalized Cox model? Default is `TRUE`
`pfac.base.covs`	a single value, or a vector of values, indicating whether the baseline covariates (if any) should be penalized (1) or not (0). Default is `pfac.base.covs = 0` (none of the baseline covariates is penalized)
`cv.seed`	value of the random seed to use for the cross-validation done to select the optimal value of the tuning parameter
`n.alpha.elnet`	number of alpha values for the two-dimensional grid of tuning parameters in elasticnet. Only relevant if `penalty = 'elasticnet'`. Default is 11, so that the resulting alpha grid is `c(1, 0.9, 0.8, ..., 0.1, 0)`
`n.folds.elnet`	number of folds to be used for the selection of the tuning parameter in elasticnet. Only relevant if `penalty = 'elasticnet'`. Default is 5
`n.cores`	number of cores to use to parallelize part of the computations. If `n.cores = 1` (default), no parallelization is done. Pro tip: you can use `parallel::detectCores()` to check how many cores are available on your computer
`verbose`	if `TRUE` (default and recommended value), information on the ongoing computations is printed in the console
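
As a small illustration of two of the arguments above, the following base R sketch builds the alpha grid implied by the default `n.alpha.elnet = 11` and shows two ways of writing `pfac.base.covs` when there are two baseline covariates. The object names `alpha.grid`, `pfac.single` and `pfac.vector`, the choice of two baseline covariates, and the assumption that the vector follows the order of the covariates in `baseline.covs` are hypothetical and used only for illustration. This is not the package's internal code.

```r
# alpha grid described for n.alpha.elnet = 11: 1, 0.9, ..., 0.1, 0
n.alpha.elnet <- 11
alpha.grid <- seq(1, 0, length.out = n.alpha.elnet)
print(alpha.grid)

# pfac.base.covs as a single value: no baseline covariate is penalized
pfac.single <- 0

# pfac.base.covs as a vector (assumed to follow the order of the covariates):
# first covariate unpenalized (0), second covariate penalized (1)
pfac.vector <- c(0, 1)
```

A grid with fewer values, for example `n.alpha.elnet = 6`, would give steps of 0.2 between 1 and 0 under the same construction.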

Value

A list containing the following objects:

call: the function call;
pcox.orig: the penalized Cox model fitted on the original dataset;
tuning: the values of the tuning parameter(s) selected through cross-validation;
surv.data: the supplied survival data (ordered by subject id);
n.boots: number of bootstrap samples;
boot.ids: a list with the ids of bootstrapped subjects (when n.boots > 0);
pcox.boot: a list where each element is a fitted penalized Cox model for a given bootstrap sample (when n.boots > 0).
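
The `surv.data` element above is described as ordered by subject id. The base R sketch below builds a toy data frame with the three required columns (`id`, `time`, `event`), checks that they are present, and orders it by `id`. The values, the extra column `baseline.age` and the checks themselves are hypothetical and only meant to illustrate the expected layout, not the checks performed by the function.

```r
# toy survival data (hypothetical values), deliberately not ordered by id
surv.data <- data.frame(
  id = c(3, 1, 2),
  time = c(2.1, 0.7, 1.5),
  event = c(1, 0, 1),
  baseline.age = c(50, 61, 47)
)

# check that the required columns are present and that event is binary
required <- c("id", "time", "event")
stopifnot(all(required %in% names(surv.data)))
stopifnot(all(surv.data$event %in% c(0, 1)))

# order the rows by subject id
surv.data <- surv.data[order(surv.data$id), ]
print(surv.data)
```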

Author(s)

Mirko Signorelli

References

Signorelli, M. (2024). pencal: an R Package for the Dynamic Prediction of Survival with Many Longitudinal Predictors. To appear in: The R Journal. Preprint: arXiv:2309.15600

Signorelli, M., Spitali, P., Al-Khalili Szigyarto, C., The MARK-MD Consortium, Tsonaka, R. (2021). Penalized regression calibration: a method for the prediction of survival outcomes using complex longitudinal and high-dimensional data. Statistics in Medicine, 40 (27), 6178-6196. DOI: 10.1002/sim.9178

Examples

# generate example data
set.seed(1234)
p = 4 # number of longitudinal predictors
simdata = simulate_prclmm_data(n = 100, p = p, p.relev = 2, 
             seed = 123, t.values = c(0, 0.2, 0.5, 1, 1.5, 2))
             
# specify options for cluster bootstrap optimism correction
# procedure and for parallel computing 
do.bootstrap = FALSE
# IMPORTANT: set do.bootstrap = TRUE to compute the optimism correction!
n.boots = ifelse(do.bootstrap, 100, 0)
more.cores = FALSE
# IMPORTANT: set more.cores = TRUE to parallelize and speed computations up!
if (!more.cores) n.cores = 1
if (more.cores) {
   # identify number of available cores on your machine
   n.cores = parallel::detectCores()
   if (is.na(n.cores)) n.cores = 8
}

# step 1 of PRC-LMM: estimate the LMMs
y.names = paste('marker', 1:p, sep = '')
step1 = fit_lmms(y.names = y.names, 
                 fixefs = ~ age, ranefs = ~ age | id, 
                 long.data = simdata$long.data, 
                 surv.data = simdata$surv.data,
                 t.from.base = t.from.base,
                 n.boots = n.boots, n.cores = n.cores)
                 
# step 2 of PRC-LMM: compute the summaries 
# of the longitudinal outcomes
step2 = summarize_lmms(object = step1, n.cores = n.cores)

# step 3 of PRC-LMM: fit the penalized Cox models
step3 = fit_prclmm(object = step2, surv.data = simdata$surv.data,
                   baseline.covs = ~ baseline.age,
                   penalty = 'ridge', n.cores = n.cores)
summary(step3)

[Package pencal version 2.2.2 Index]