fit_prclmm {pencal}    R Documentation

Step 3 of PRC-LMM (estimation of the penalized Cox model(s))

Description

This function performs the third step in the estimation of the PRC-LMM model proposed in Signorelli et al. (2021): it fits the penalized Cox model(s).

Usage

fit_prclmm(object, surv.data, baseline.covs = NULL, penalty = "ridge",
  standardize = TRUE, pfac.base.covs = 0, cv.seed = 19920207,
  n.alpha.elnet = 11, n.folds.elnet = 5, n.cores = 1, verbose = TRUE)

Arguments

object

the output of step 2 of the PRC-LMM procedure, as produced by the summarize_lmms function

surv.data

a data frame with the survival data and (if relevant) additional baseline covariates. surv.data should contain at least a subject id (called id), the time-to-event outcome (time), and a binary event indicator (event)

baseline.covs

a formula specifying the variables (e.g., baseline age) in surv.data that should be included as baseline covariates in the penalized Cox model. Example: baseline.covs = ~ baseline.age. Default is NULL (no baseline covariates)

penalty

the type of penalty function used for regularization. Default is 'ridge', other possible values are 'elasticnet' and 'lasso'

standardize

logical argument: should the predictors (both baseline covariates and predicted random effects) be standardized when included as covariates in the penalized Cox model? Default is TRUE

pfac.base.covs

a single value, or a vector of values, indicating whether the baseline covariates (if any) should be penalized (1) or not (0). Default is pfac.base.covs = 0 (none of the baseline covariates is penalized)

cv.seed

value of the random seed to use for the cross-validation done to select the optimal value of the tuning parameter

n.alpha.elnet

number of alpha values for the two-dimensional grid of tuning parameters in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 11, so that the resulting alpha grid is c(1, 0.9, 0.8, ..., 0.1, 0)

n.folds.elnet

number of folds to be used for the selection of the tuning parameter in elasticnet. Only relevant if penalty = 'elasticnet'. Default is 5

n.cores

number of cores to use to parallelize part of the computations. If n.cores = 1 (default), no parallelization is done. Tip: you can use parallel::detectCores() to check how many cores are available on your computer

verbose

if TRUE (default and recommended value), information on the ongoing computations is printed in the console
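
The argument descriptions above can be illustrated with a short base R sketch. It is not taken from the package internals: the toy data frame and the object names required.cols, missing.cols, alpha.grid and avail.cores are hypothetical and only used for illustration. The sketch checks that a surv.data candidate has the required columns, reproduces the alpha grid stated for the default n.alpha.elnet = 11, and picks a value for n.cores that does not exceed the cores available.

```r
# Hypothetical toy data: id, time and event are the column names required
# in surv.data; baseline.age is an illustrative baseline covariate
surv.data = data.frame(id = 1:5,
                       time = c(2.1, 0.7, 3.4, 1.2, 2.8),
                       event = c(1, 0, 1, 1, 0),
                       baseline.age = c(54, 61, 47, 58, 66))

# check that the required columns are present and that event is binary
required.cols = c('id', 'time', 'event')
missing.cols = setdiff(required.cols, names(surv.data))
stopifnot(length(missing.cols) == 0,
          all(surv.data$event %in% c(0, 1)))

# alpha grid described for n.alpha.elnet = 11: 1, 0.9, ..., 0.1, 0
# (assumption: an equally spaced grid from 1 down to 0)
n.alpha.elnet = 11
alpha.grid = seq(1, 0, length.out = n.alpha.elnet)

# choose n.cores without exceeding the cores available on this machine;
# detectCores() may return NA, in which case we fall back to 1
avail.cores = parallel::detectCores()
n.cores = if (is.na(avail.cores)) 1 else min(2, avail.cores)
```

Running such checks before calling fit_prclmm is optional; they only make problems with the input data or the chosen settings visible earlier.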

Value

A list containing the following objects:

Author(s)

Mirko Signorelli

References

Signorelli, M. (2023). pencal: an R Package for the Dynamic Prediction of Survival with Many Longitudinal Predictors. arXiv preprint: arXiv:2309.15600

Signorelli, M., Spitali, P., Al-Khalili Szigyarto, C., The MARK-MD Consortium, Tsonaka, R. (2021). Penalized regression calibration: a method for the prediction of survival outcomes using complex longitudinal and high-dimensional data. Statistics in Medicine, 40 (27), 6178-6196. DOI: 10.1002/sim.9178

See Also

fit_lmms (step 1), summarize_lmms (step 2), performance_prc

Examples

# generate example data
set.seed(1234)
p = 4 # number of longitudinal predictors
simdata = simulate_prclmm_data(n = 100, p = p, p.relev = 2, 
             seed = 123, t.values = c(0, 0.2, 0.5, 1, 1.5, 2))
             
# specify options for cluster bootstrap optimism correction
# procedure and for parallel computing 
do.bootstrap = FALSE
# IMPORTANT: set do.bootstrap = TRUE to compute the optimism correction!
n.boots = ifelse(do.bootstrap, 100, 0)
more.cores = FALSE
# IMPORTANT: set more.cores = TRUE to parallelize and speed computations up!
if (!more.cores) n.cores = 1
if (more.cores) {
   # identify number of available cores on your machine
   n.cores = parallel::detectCores()
   if (is.na(n.cores)) n.cores = 8
}

# step 1 of PRC-LMM: estimate the LMMs
y.names = paste('marker', 1:p, sep = '')
step1 = fit_lmms(y.names = y.names, 
                 fixefs = ~ age, ranefs = ~ age | id, 
                 long.data = simdata$long.data, 
                 surv.data = simdata$surv.data,
                 t.from.base = t.from.base,
                 n.boots = n.boots, n.cores = n.cores)
                 
# step 2 of PRC-LMM: compute the summaries 
# of the longitudinal outcomes
step2 = summarize_lmms(object = step1, n.cores = n.cores)

# step 3 of PRC-LMM: fit the penalized Cox models
step3 = fit_prclmm(object = step2, surv.data = simdata$surv.data,
                   baseline.covs = ~ baseline.age,
                   penalty = 'ridge', n.cores = n.cores)
summary(step3)                    
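
As a variation on step 3, the following sketch fits the penalized Cox model with penalty = 'elasticnet' instead of 'ridge'. It uses only arguments listed under Usage; the values chosen for n.alpha.elnet and n.folds.elnet are illustrative rather than recommended, and the object name step3.en is hypothetical. The pipeline is repeated from step 1 so that the block can be run on its own, and it is skipped if pencal is not installed.

```r
if (requireNamespace('pencal', quietly = TRUE)) {
  library(pencal)

  # generate example data (same settings as in the example above)
  p = 4 # number of longitudinal predictors
  simdata = simulate_prclmm_data(n = 100, p = p, p.relev = 2,
               seed = 123, t.values = c(0, 0.2, 0.5, 1, 1.5, 2))

  # steps 1 and 2, without bootstrap and without parallelization
  step1 = fit_lmms(y.names = paste('marker', 1:p, sep = ''),
                   fixefs = ~ age, ranefs = ~ age | id,
                   long.data = simdata$long.data,
                   surv.data = simdata$surv.data,
                   t.from.base = t.from.base,
                   n.boots = 0, n.cores = 1)
  step2 = summarize_lmms(object = step1, n.cores = 1)

  # step 3 with an elasticnet penalty: a coarser alpha grid
  # (6 values instead of the default 11) and 5 cross-validation folds
  step3.en = fit_prclmm(object = step2, surv.data = simdata$surv.data,
                        baseline.covs = ~ baseline.age,
                        penalty = 'elasticnet',
                        n.alpha.elnet = 6, n.folds.elnet = 5,
                        n.cores = 1, verbose = FALSE)
  summary(step3.en)
}
```

A coarser alpha grid reduces the number of cross-validations to run, at the price of a less fine-grained search over the tuning parameters.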

[Package pencal version 2.2.1 Index]