R: Evaluation of the selection accuracy of a prediction model

selRes {biospear}

R Documentation

Evaluation of the selection accuracy of a prediction model

Description

This function computes several criteria to assess the selection accuracy of a prediction model. It is only applicable to simulated data sets, for which the true (active) biomarkers are known.

Usage

selRes(res)

Arguments

res

an object of class 'resBMsel' generated by BMsel with data simulated using simdata.

Details

Based on the 2x2 contingency table (active vs. inactive / selected vs. unselected), four selection criteria are provided:
- the false discovery rate (FDR) that is the proportion of inactive biomarkers among the selected ones,
- the false non-discovery rate (FNDR) that is the proportion of active biomarkers among the unselected ones,
- the false negative rate (FNR) that is the proportion of unselected biomarkers among the active ones,
- the false positive rate (FPR) that is the proportion of selected biomarkers among the inactive ones.
These four criteria lie between 0 and 1; lower values indicate better selection.
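As an illustration of the four criteria above, here is a minimal sketch that derives them from the cells of the 2x2 table. The vectors `active` and `selected` are hypothetical indicators made up for this example; they are not objects returned by BMsel or simdata, and the code is not necessarily how selRes computes its output internally.

```r
# Hypothetical truth and selection indicators for 10 biomarkers
# (illustrative values only, not produced by biospear)
active   <- c(TRUE, TRUE, TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
selected <- c(TRUE, TRUE, FALSE, TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)

# Cells of the 2x2 contingency table
tp <- sum(active & selected)     # active and selected
fp <- sum(!active & selected)    # inactive but selected
fn <- sum(active & !selected)    # active but unselected
tn <- sum(!active & !selected)   # inactive and unselected

crit <- c(
  FDR  = fp / (fp + tp),  # inactive among the selected
  FNDR = fn / (fn + tn),  # active among the unselected
  FNR  = fn / (fn + tp),  # unselected among the active
  FPR  = fp / (fp + tn)   # selected among the inactive
)
print(round(crit, 3))
```

In this sketch a criterion is undefined (0/0) when its denominator is empty, for instance the FDR when no biomarker is selected; how selRes treats that case is not stated here.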
Two discrimination criteria are also provided. They reflect the ability to discard inactive biomarkers more readily than active ones, independently of the tuning parameters:
- the area under the ROC curve (AUC) depending on the sensitivity [1 - FNR] and specificity [1 - FPR],
- the area under the precision-recall curve (AUPRC) depending on the recall [1 - FNR] and precision [1 - FDR] (Davis and Goadrich, 2006).
Of note, the AUPRC is more meaningful than the AUC when there are many more inactive than active biomarkers. These two criteria lie between 0 and 1; higher values indicate better discrimination.
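The two discrimination criteria can be sketched on a ranked list of biomarkers. The example below is an illustration under stated assumptions: `score` is a hypothetical importance ranking (higher means retained longer), the AUC uses the standard rank-based (Mann-Whitney) formula, and the AUPRC is approximated by the average precision, which is only one of several common estimators. None of this is claimed to match the internal computation of selRes.

```r
# Hypothetical ranking of 8 biomarkers (illustrative values only)
active <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
score  <- c(0.9,  0.8,  0.7,   0.6,  0.5,   0.4,   0.3,   0.2)

n1 <- sum(active)    # number of active biomarkers
n0 <- sum(!active)   # number of inactive biomarkers

# AUC: proportion of (active, inactive) pairs in which the active
# biomarker has the higher score (rank-based formula, no ties here)
r   <- rank(score)
auc <- (sum(r[active]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Average precision: mean of the precision [1 - FDR] evaluated each
# time an active biomarker is reached when going down the ranking
ord       <- order(score, decreasing = TRUE)
hits      <- active[ord]
precision <- cumsum(hits) / seq_along(hits)
ap        <- mean(precision[hits])

print(c(AUC = auc, AP = ap))
```

The AUC counts correctly ordered pairs against all inactive biomarkers, so a large inactive set can keep it high even when several inactive biomarkers rank near the top; the precision-based summary is computed only among top-ranked biomarkers and is therefore more sensitive to such errors.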

Value

A matrix with 6 rows (one per criterion described in Details) and one column per method used to fit res.

Author(s)

Nils Ternes, Federico Rotolo, and Stefan Michiels
Maintainer: Nils Ternes <nils.ternes@yahoo.com>

References

Davis J and Goadrich M. The relationship between Precision-Recall and ROC curves. Proceedings of the 23rd International Conference on Machine Learning. ACM, Pittsburgh PA, 2006, 233-240.
Ternes N, Rotolo F and Michiels S. Empirical extensions of the lasso penalty to reduce the false discovery rate in high-dimensional Cox regression models. Statistics in Medicine 2016;35(15):2561-2573. doi:10.1002/sim.6927
Ternes N, Rotolo F, Heinze G and Michiels S. Identification of biomarker-by-treatment interactions in randomized clinical trials with survival outcomes and high-dimensional spaces. Biometrical Journal. In press. doi:10.1002/bimj.201500234

Examples

########################################
# Simulated data set
########################################

## Low calculation time
  set.seed(654321)
  sdata <- simdata(
    n = 500, p = 20, q.main = 3, q.inter = 0,
    prob.tt = 0.5, alpha.tt = 0,
    beta.main = -0.8,
    b.corr = 0.6, b.corr.by = 4,
    m0 = 5, wei.shape = 1, recr = 4, fu = 2,
    timefactor = 1)

  resBM <- BMsel(
    data = sdata, 
    method = c("lasso", "lasso-pcvl"), 
    inter = FALSE, 
    folds = 5)
  
  # Selection accuracy criteria: 6 rows, one column per fitted method
  selAcc <- selRes(resBM)
  selAcc

## Not run: 
## Moderate calculation time
  set.seed(123456)
  sdata <- simdata(
    n = 500, p = 100, q.main = 5, q.inter = 5,
    prob.tt = 0.5, alpha.tt = -0.5,
    beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
    b.corr = 0.6, b.corr.by = 10,
    m0 = 5, wei.shape = 1, recr = 4, fu = 2,
    timefactor = 1,
    active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))

  resBM <- BMsel(
    data = sdata, 
    method = c("lasso", "lasso-pcvl"), 
    inter = TRUE, 
    folds = 5)
  
  selAcc <- selRes(resBM)

## End(Not run)

[Package biospear version 1.0.2 Index]