PSP-class {ePCR}

R Documentation

Penalized Single Predictor (PSP) S4-class as a member of PEP-ensembles

Description

PSP is a single penalized Cox regression model in which an alpha/lambda grid has been optimized using cross-validation and a chosen prediction metric. PSPs are the individual building blocks that are compiled into PEPs, the ensemble objects that average over multiple PSPs to generate an ensemble prediction. Typically, a single PSP models one part of the data, such as a single cohort stratum.
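
The averaging idea can be sketched in base R. This is only an illustration under stated assumptions: the risk scores below are made up, the names psp_scores and ensemble_score are hypothetical, and a plain arithmetic mean is assumed; the exact combination rule used by PEP objects is not described here.

```r
# Hypothetical risk scores for five patients from three PSPs
# (values are made up for illustration only).
psp_scores <- cbind(
  psp_a = c(0.2, 1.1, -0.4, 0.7,  0.0),
  psp_b = c(0.4, 0.9, -0.2, 0.5,  0.1),
  psp_c = c(0.3, 1.0, -0.6, 0.9, -0.1)
)

# Assumed ensemble rule: plain per-patient mean across the PSPs
ensemble_score <- rowMeans(psp_scores)
ensemble_score
```

Each row is one patient and each column one PSP, so rowMeans yields one ensemble score per patient.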

Slots

description: A general user-provided string describing the PSP
features: A character vector indicating feature names
strata: Information on whether the data matrix x included substrata (used in plotting functions etc.)
alphaseq: The sequence of alpha values to test, each within [0,1]; alpha = 0 being ridge regression, 0 < alpha < 1 being elastic net and alpha = 1 being LASSO
cvfolds: The number of cross-validation folds to utilize; by default 10
nlambda: The number of lambda values utilized in each regularization path; by default 100 as in the glmnet package
cvmean: A matrix indicating the mean CV performance in the alpha/lambda grid (preferred over median)
cvmedian: A matrix indicating the median CV performance in the alpha/lambda grid
cvstdev: A matrix indicating the standard deviation in CV performance over the folds in the alpha/lambda grid
cvmin: A matrix indicating the minimum CV performance in the alpha/lambda grid
cvmax: A matrix indicating the maximum CV performance in the alpha/lambda grid
score: The scoring function, user-defined or one provided by ePCR package such as score.cindex or score.iAUC
cvrepeat: Number of times the cross-validation procedure is repeated and then averaged over, in order to reduce the effect of binning samples into folds
impute: The imputation function used if the provided matrix 'x' includes missing values; by default the impute.knn function from the Bioconductor package 'impute'
optimum: The optimum in the alpha/lambda grid, giving the optimal alpha and the corresponding optimal lambda
seed: The initial random seed used for cross-validation
x: The input data matrix
x.expand: A function that allows expansion of matrix 'x' to include interactions between variables; if no such are desired, this should be an identity function
y: The Surv-object as in survival-package, which serves as the response y
fit: The glmnet coxnet-object obtained with optimal alpha
criterion: The optimizing criterion; by default "min" for minimizing CV-error
dictionary: A list of descriptions for each variable
regAUC: A numeric vector of areas under the regularization curve, as computed by the integrateRegCurve function
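
As a rough illustration of how the cvmean and optimum slots relate, the following base R sketch locates the best cell of a small alpha/lambda grid. The values are made up, and the layout (rows for alpha, columns for lambda) and the assumption that a higher score is better, as with a concordance index, are illustrative choices rather than a description of the package internals.

```r
# Hypothetical mean CV scores: rows = alpha values, columns = lambda indices
alphaseq <- c(0, 0.5, 1)
cvmean <- matrix(
  c(0.61, 0.63, 0.62,
    0.64, 0.68, 0.66,
    0.60, 0.65, 0.67),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("alpha=", alphaseq), paste0("lambda", 1:3))
)

# Locate the best grid cell, assuming a higher score is better
idx <- arrayInd(which.max(cvmean), dim(cvmean))
best_alpha <- alphaseq[idx[1, 1]]   # alpha of the best row
best_lambda_index <- idx[1, 2]      # column index of the best lambda
best_score <- cvmean[idx[1, 1], idx[1, 2]]
```

For an error-type metric that should be minimized, which.min would replace which.max.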

Examples

# As an example, illustrate a naive PSP built on the small medication cohort
data(TYKSSIMU)
library(survival)
# Minimal example with far fewer patients and variables
psp_ex <- new("PSP", alphaseq=c(0.2, 0.8), nlambda=20, folds=3,
	x = xMEDISIMU[1:80,c(1:20,40:50)], y = yMEDISIMU[1:80,"surv"],
	seeds = 1, score=score.cindex)

plot(psp_ex) # Optimization surface of alpha/lambda

# Illustrate the use of some PSP-methods:
PSP.KM(psp_ex, cutoff = 0.5) # Kaplan-Meier
PSP.PCA(psp_ex) # PCA plot of training data
PSP.BOX(psp_ex) # Boxplots, here for the first training variable
PSP.CSP(psp_ex) # Cumulative survival probabilities for the training data
invisible(PSP.NA(psp_ex)) # Time-to-event Nelson-Aalen heuristic algorithm

## Not run: 
# Computationally intensive novel PSP-fitting is omitted from the test runs
# Functions for readily fitted PSP-objects are illustrated above
data(TYKSSIMU)
library(survival)
psp_meditext <- new("PSP", x = rbind(xMEDISIMU, xTEXTSIMU),
	y = Surv(rbind(yMEDISIMU, yTEXTSIMU)[,"surv"]),
	plot = TRUE, alphaseq = seq(0, 1, by=.01), scorefunc = score.cindex,
	seed = 1, folds = 10, nlambda = 100)
plot(psp_meditext)

## End(Not run)

[Package ePCR version 0.11.0 Index]