selRes {biospear} | R Documentation
Evaluation of the selection accuracy of a prediction model
Description
This function computes several criteria to assess the selection accuracy of a prediction model. Of note, it can only be used with simulated data sets, for which the truly active biomarkers are known.
Usage
selRes(res)
Arguments
res |
an object of class ' |
Details
Based on the 2x2 contingency table (active vs. inactive / selected vs. unselected), four selection criteria are provided:
- the false discovery rate (FDR), that is the proportion of inactive biomarkers among the selected ones,
- the false non-discovery rate (FNDR), that is the proportion of active biomarkers among the unselected ones,
- the false negative rate (FNR), that is the proportion of unselected biomarkers among the active ones,
- the false positive rate (FPR), that is the proportion of selected biomarkers among the inactive ones.
These four criteria are between 0 and 1, and must be minimized.
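As a rough illustration of the four definitions above, the following sketch computes them by hand from two logical vectors. The vectors `active` and `selected` and their values are hypothetical toy data, not output of the package, and the sketch does not claim to reproduce the internal code of selRes; in particular, it does not handle the case where a denominator is zero (for example when no biomarker is selected).

```r
# Hypothetical toy data: 10 candidate biomarkers, the first 3 truly active.
active   <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
selected <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)

# Cells of the 2x2 contingency table
tp <- sum(active & selected)     # active and selected
fp <- sum(!active & selected)    # inactive but selected
fn <- sum(active & !selected)    # active but unselected
tn <- sum(!active & !selected)   # inactive and unselected

# The four selection criteria, each a proportion within one margin of the table
crit <- c(
  FDR  = fp / (fp + tp),   # inactive among selected
  FNDR = fn / (fn + tn),   # active among unselected
  FNR  = fn / (fn + tp),   # unselected among active
  FPR  = fp / (fp + tn)    # selected among inactive
)
print(round(crit, 3))
```

With these toy vectors, one of the three selected biomarkers is inactive (FDR = 1/3) and one of the seven inactive biomarkers is selected (FPR = 1/7).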
Two discrimination criteria are also provided. They reflect the ability to discard inactive biomarkers more readily than active ones, independently of the tuning parameters:
- the area under the ROC curve (AUC), which depends on the sensitivity [1 - FNR] and the specificity [1 - FPR],
- the area under the precision-recall curve (AUPRC), which depends on the recall [1 - FNR] and the precision [1 - FDR] (Davis and Goadrich, 2006).
Of note, the AUPRC is more meaningful than the AUC when there are many more inactive than active biomarkers.
These two criteria are between 0 and 1, and must be maximized.
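The two discrimination criteria can be sketched in the same spirit. The example below assumes that each biomarker comes with a ranking score (a hypothetical `score` vector, where a higher value means the biomarker is retained more readily); how selRes actually ranks biomarkers is not stated here. The AUC is obtained from the rank-sum (Mann-Whitney) identity, and the AUPRC is approximated by the average precision, which is only one of several possible estimators of the area under the precision-recall curve.

```r
# Hypothetical toy data: 8 biomarkers, 3 truly active, no tied scores.
active <- c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
score  <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1)

n1 <- sum(active)    # number of active biomarkers
n0 <- sum(!active)   # number of inactive biomarkers

# AUC: probability that a random active biomarker outranks a random inactive one
r   <- rank(score)
auc <- (sum(r[active]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# AUPRC approximated by average precision: mean of the precision values
# observed at the position of each active biomarker in the ranking
ord       <- order(score, decreasing = TRUE)
hits      <- active[ord]
precision <- cumsum(hits) / seq_along(hits)
auprc     <- sum(precision[hits]) / n1

print(round(c(AUC = auc, AUPRC = auprc), 3))
```

Because the AUPRC ignores the inactive biomarkers that are ranked low, it reacts more strongly than the AUC to inactive biomarkers placed near the top of the ranking, which is why it is preferred when inactive biomarkers are far more numerous.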
Value
A matrix of dimension 6 x the number of methods used to fit res.
Author(s)
Nils Ternes, Federico Rotolo, and Stefan Michiels
Maintainer: Nils Ternes nils.ternes@yahoo.com
References
Davis J and Goadrich M.
The relationship between Precision-Recall and ROC curves.
Proceedings of the 23rd International Conference on Machine Learning.
ACM, Pittsburgh PA, 2006:233-240.
Ternes N, Rotolo F and Michiels S.
Empirical extensions of the lasso penalty to reduce
the false discovery rate in high-dimensional Cox regression models.
Statistics in Medicine 2016;35(15):2561-2573.
doi:10.1002/sim.6927
Ternes N, Rotolo F, Heinze G and Michiels S.
Identification of biomarker-by-treatment interactions in randomized
clinical trials with survival outcomes and high-dimensional spaces.
Biometrical Journal. In press.
doi:10.1002/bimj.201500234
Examples
########################################
# Simulated data set
########################################
## Low calculation time
set.seed(654321)
sdata <- simdata(
  n = 500, p = 20, q.main = 3, q.inter = 0,
  prob.tt = 0.5, alpha.tt = 0,
  beta.main = -0.8,
  b.corr = 0.6, b.corr.by = 4,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1)
resBM <- BMsel(
  data = sdata,
  method = c("lasso", "lasso-pcvl"),
  inter = FALSE,
  folds = 5)
selAcc <- selRes(resBM)
## Not run:
## Moderate calculation time
set.seed(123456)
sdata <- simdata(
  n = 500, p = 100, q.main = 5, q.inter = 5,
  prob.tt = 0.5, alpha.tt = -0.5,
  beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
  b.corr = 0.6, b.corr.by = 10,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1,
  active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))
resBM <- BMsel(
  data = sdata,
  method = c("lasso", "lasso-pcvl"),
  inter = TRUE,
  folds = 5)
selAcc <- selRes(resBM)
## End(Not run)