predRes {biospear} | R Documentation |
Evaluation of the prediction accuracy of a prediction model
Description
This function computes several criteria to assess the prediction accuracy of a prediction model.
Usage
predRes(res, method, traindata, newdata, int.cv, int.cv.nfold = 5, time,
trace = TRUE, ncores = 1)
## S3 method for class 'predRes'
plot(x, method, crit = c("C", "PE", "dC"),
xlim, ylim, xlab, ylab, col, ...)
Arguments
res |
an object of class ' |
method |
methods for which prediction criteria are computed. If missing, all methods contained in res are considered. |
traindata |
input |
newdata |
input |
int.cv |
logical parameter indicating whether a double cross-validation process (2CV) should be performed to mimic an external validation set. |
int.cv.nfold |
number of folds for the double cross-validation. Considering a large value for |
time |
time points to compute the prediction criteria. |
trace |
logical parameter indicating if messages should be printed. |
ncores |
number of CPUs used (for the double cross-validation). |
x |
an object of class 'predRes'. |
crit |
parameter indicating the criterion for which the results will be printed ("C", "PE" or "dC"). |
xlim , ylim , xlab , ylab , col |
usual parameters for plot. |
... |
other parameters for plot. |
Details
To evaluate the accuracy of the selected models, three predictive accuracy measures are implemented:
- the integrated Brier score (PE) to measure the overall prediction error of the prediction model. The time-dependent Brier score is a quadratic score based on the predicted time-dependent survival probability.
- Uno's C-statistic (C) to evaluate the discrimination of the prediction model. It is one of the least biased estimators of the concordance statistic in the presence of censoring (Uno et al., 2011).
- the absolute difference of the treatment-specific Uno's C-statistics (dC) to evaluate the interaction strength of the prediction model (Ternes et al., 2016).
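As a rough illustration of what the three criteria measure, the sketch below computes simplified versions of them in base R on a toy dataset without censoring, so that no inverse-probability-of-censoring weights are needed. It is not the code used by predRes: the helper names (brier_at, cstat), the simulated data, the time grid, and the convention of counting tied scores as 1/2 are all assumptions made for this example.

```r
# Toy illustration only: uncensored data, so the censoring weights used by
# the actual Brier score and by Uno's C-statistic would all equal 1.
set.seed(1)
n     <- 200
treat <- rbinom(n, 1, 0.5)                 # hypothetical treatment arm (0/1)
x     <- rnorm(n)                          # hypothetical biomarker
lp    <- ifelse(treat == 1, -0.8 * x, 0)   # biomarker acts in the treated arm only
time  <- rexp(n, rate = exp(lp))           # exponential event times, no censoring

# Brier score at time t: mean squared gap between the observed status
# I(T > t) and the predicted survival probability S(t | x).
brier_at <- function(t, time, surv_prob) {
  mean((as.numeric(time > t) - surv_prob)^2)
}

# Integrated score (PE), approximated by averaging over an equally spaced grid.
# The prediction used here is the true exponential survival function.
grid <- seq(0.1, 2, by = 0.1)
pe <- mean(sapply(grid, function(t) brier_at(t, time, exp(-t * exp(lp)))))

# Concordance without censoring: among pairs with different event times, the
# proportion in which the subject failing first has the higher risk score
# (tied scores count 1/2, a convention chosen for this sketch).
cstat <- function(time, score) {
  dt <- outer(time, time, "-")     # dt[i, j] = time[i] - time[j]
  ds <- outer(score, score, "-")   # ds[i, j] = score[i] - score[j]
  usable <- dt < 0                 # pairs where subject i fails before subject j
  (sum(ds[usable] > 0) + 0.5 * sum(ds[usable] == 0)) / sum(usable)
}

c_all <- cstat(time, lp)           # overall discrimination (C)

# dC: absolute difference of the treatment-specific C-statistics.
c_trt  <- cstat(time[treat == 1], lp[treat == 1])
c_ctrl <- cstat(time[treat == 0], lp[treat == 0])
dc <- abs(c_trt - c_ctrl)
```

In this toy setting the score is constant in the control arm, so its arm-specific concordance is 0.5 and dC reflects only the discrimination gained in the treated arm.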
For simulated datasets, the predictive accuracy metrics are also computed for the "oracle model", that is, the unpenalized Cox proportional hazards model fitted to the active biomarkers only.
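The idea of an oracle model can be pictured with a minimal sketch on hand-simulated data, assuming the survival package is installed. The data-generating code and the variable names (bm1 to bm4, toy, oracle) are hypothetical; the sketch does not use simdata and is not the package's internal implementation.

```r
library(survival)

set.seed(2)
n  <- 300
bm <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("bm", 1:4)))

# Only bm1 and bm2 are "active": they are the only biomarkers in the hazard.
lp    <- -0.8 * bm[, "bm1"] - 0.8 * bm[, "bm2"]
event <- rexp(n, rate = exp(lp))   # latent event times
cens  <- rexp(n, rate = 0.3)       # independent censoring times
toy   <- data.frame(time   = pmin(event, cens),
                    status = as.numeric(event <= cens),
                    bm)

# Oracle model: unpenalized Cox model restricted to the truly active biomarkers.
oracle <- coxph(Surv(time, status) ~ bm1 + bm2, data = toy)
coef(oracle)
```

Because the active set is known only in simulations, such a model serves as a benchmark rather than as a competitor that could be fitted to real data.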
Value
A list of the same length as time (one element per time point considered). Each element of the list contains between 1 and 3 sublists depending on the chosen validation (i.e. training set [always computed], internal validation through double cross-validation (2CV) [if int.cv = TRUE] and/or external validation [if newdata is provided]). Each sublist is a matrix containing the predictive accuracy metrics of the implemented methods.
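A nested result of this kind can be navigated with standard list and matrix indexing. The object below is a hand-built mock with the shape described here, not actual predRes output: the element names ("time=1", "Training set", "External validation"), the row and column orientation of the matrices, and the NA placeholder values are all assumptions made for illustration. Calling str() on a real result shows its actual names and layout.

```r
# Hand-built mock: one element per time point, each holding one matrix per
# validation type. Names and layout are hypothetical; values are left as NA.
methods  <- c("lasso", "lasso-pcvl")
criteria <- c("C", "PE", "dC")
empty    <- matrix(NA_real_, nrow = length(criteria), ncol = length(methods),
                   dimnames = list(criteria, methods))

mock <- setNames(
  lapply(1:2, function(t) list("Training set"        = empty,
                               "External validation" = empty)),
  paste0("time=", 1:2))

length(mock)                                # number of time points
names(mock[[1]])                            # validation types at the first time point
mock[[1]][["Training set"]]["C", "lasso"]   # one criterion for one method
```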
Author(s)
Nils Ternes, Federico Rotolo, and Stefan Michiels
Maintainer: Nils Ternes <nils.ternes@yahoo.com>
References
Ternes N, Rotolo F and Michiels S.
Empirical extensions of the lasso penalty to reduce
the false discovery rate in high-dimensional Cox regression models.
Statistics in Medicine 2016;35(15):2561-2573.
doi:10.1002/sim.6927
Ternes N, Rotolo F, Heinze G and Michiels S.
Identification of biomarker-by-treatment interactions in randomized
clinical trials with survival outcomes and high-dimensional spaces.
Biometrical Journal. In press.
doi:10.1002/bimj.201500234
Uno H, Cai T, Pencina MJ, D'Agostino RB and Wei LJ.
On the C-statistics for evaluating overall adequacy
of risk prediction procedures with censored survival data.
Statistics in Medicine 2011;30:1105-1117.
doi:10.1002/sim.4154
Examples
########################################
# Simulated data set
########################################
## Low calculation time
set.seed(654321)
sdata <- simdata(
  n = 500, p = 20, q.main = 3, q.inter = 0,
  prob.tt = 0.5, alpha.tt = 0,
  beta.main = -0.8,
  b.corr = 0.6, b.corr.by = 4,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1)
newdata <- simdataV(
  traindata = sdata,
  Nvalid = 500
)
resBM <- BMsel(
  data = sdata,
  method = c("lasso", "lasso-pcvl"),
  inter = FALSE,
  folds = 5)
predAcc <- predRes(
  res = resBM,
  traindata = sdata,
  newdata = newdata,
  time = 1:5)
plot(predAcc, crit = "C")
## Not run:
## Moderate calculation time
set.seed(123456)
sdata <- simdata(
  n = 500, p = 100, q.main = 5, q.inter = 5,
  prob.tt = 0.5, alpha.tt = -0.5,
  beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
  b.corr = 0.6, b.corr.by = 10,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1,
  active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))
resBM <- BMsel(
  data = sdata,
  method = c("lasso", "lasso-pcvl"),
  inter = TRUE,
  folds = 5)
predAcc <- predRes(
  res = resBM,
  traindata = sdata,
  int.cv = TRUE,
  time = 1:5,
  ncores = 5)
plot(predAcc, crit = "dC")
## End(Not run)
########################################
# Breast cancer data set
########################################
## Not run:
data(Breast)
dim(Breast)
set.seed(123456)
resBM <- BMsel(
  data = Breast,
  x = 4:ncol(Breast),
  y = 2:1,
  tt = 3,
  inter = FALSE,
  std.x = TRUE,
  folds = 5,
  method = c("lasso", "lasso-pcvl"))
summary(resBM)
predAcc <- predRes(
  res = resBM,
  traindata = Breast,
  time = 1:4,
  trace = TRUE)
plot(predAcc, crit = "C")
## End(Not run)