cv_decision_curve {rmda}

R Documentation

Calculate cross-validated decision curves

Description

This is a wrapper for 'decision_curve' that computes k-fold cross-validated estimates of sensitivity, specificity, and net benefit so that cross-validated net benefit curves can be plotted.
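As a rough illustration of k-fold cross-validated sensitivity and specificity, the sketch below uses base R only. The simulated data, the variable names, the 0.2 threshold, the "risk > threshold" rule, and the choice to average per-fold estimates are all assumptions made for this example. The package's internal procedure may differ.

```r
# Illustrative sketch only: simulated data, not the rmda implementation.
set.seed(42)
n <- 500
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- rbinom(n, 1, plogis(-1 + 0.8 * sim$x1 + 0.5 * sim$x2))

k <- 5            # number of folds
threshold <- 0.2  # hypothetical high-risk threshold

# Randomly assign each row to one of k equally sized folds.
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, predict on the held-out fold,
# and estimate sensitivity and specificity from the held-out predictions.
fold_est <- sapply(seq_len(k), function(i) {
  train <- sim[fold_id != i, ]
  test  <- sim[fold_id == i, ]
  fit   <- glm(y ~ x1 + x2, data = train, family = binomial(link = "logit"))
  risk  <- predict(fit, newdata = test, type = "response")
  high  <- risk > threshold
  c(sensitivity = mean(high[test$y == 1]),
    specificity = mean(!high[test$y == 0]))
})

# Average the per-fold estimates (one possible way to combine folds).
cv_est <- rowMeans(fold_est)
print(round(cv_est, 3))
```

Because each observation is scored by a model that was not fitted to it, these estimates are typically less optimistic than the apparent (in-sample) ones.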

Usage

cv_decision_curve(formula, data, family = binomial(link = "logit"),
  thresholds = seq(0, 1, by = 0.01), folds = 5, study.design = c("cohort",
  "case-control"), population.prevalence, policy = c("opt-in", "opt-out"))

Arguments

`formula`	an object of class 'formula' of the form outcome ~ predictors, giving the prediction model to be fitted using glm. The outcome must be a binary variable that equals '1' for cases and '0' for controls.
`data`	data.frame containing outcome and predictors. Missing data on any of the predictors will cause the entire observation to be removed.
`family`	a description of the error distribution and link function to pass to 'glm' for model fitting. Defaults to binomial(link = "logit") for logistic regression.
`thresholds`	Numeric vector of high risk thresholds to use when plotting and calculating net benefit values.
`folds`	Number of folds for k-fold cross-validation.
`study.design`	Either 'cohort' (default) or 'case-control' describing the study design used to obtain data. See details for more information.
`population.prevalence`	Outcome prevalence in the population, used to calculate decision curves when study.design = 'case-control'.
`policy`	Either 'opt-in' (default) or 'opt-out', describing the type of policy for which to report the net benefit. A policy is 'opt-in' when the standard-of-care for a population is to assign a particular 'treatment' to no one. Clinicians then use a risk model to categorize patients as 'high-risk', with the recommendation to treat high-risk patients with some intervention. Alternatively, an 'opt-out' policy is applicable to contexts where the standard-of-care is to recommend a treatment to an entire patient population. The potential use of a risk model in this setting is to identify patients who are 'low-risk' and recommend that those patients 'opt-out' of treatment.
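To make the roles of the threshold and the prevalence concrete, the sketch below evaluates the standard opt-in net benefit, NB = TPR*rho - (t/(1 - t))*FPR*(1 - rho), at a single threshold t. The helper name net_benefit_opt_in and all numeric values are hypothetical and not part of rmda. The opt-out case is not covered here.

```r
# Hypothetical helper, not an rmda function: opt-in net benefit at one threshold.
net_benefit_opt_in <- function(tpr, fpr, rho, threshold) {
  tpr * rho - (threshold / (1 - threshold)) * fpr * (1 - rho)
}

# Assumed operating characteristics of some risk model at threshold 0.20.
tpr <- 0.80
fpr <- 0.30
threshold <- 0.20

# In a case-control sample the share of cases is fixed by design
# (here assumed to be 50%), so it does not reflect the population.
nb_sample_prev <- net_benefit_opt_in(tpr, fpr, rho = 0.50, threshold)

# Using an assumed population prevalence of 5% instead.
nb_population_prev <- net_benefit_opt_in(tpr, fpr, rho = 0.05, threshold)

print(c(sample = nb_sample_prev, population = nb_population_prev))
# sample = 0.3625, population = -0.03125
```

With the same TPR and FPR, the net benefit changes sign once the prevalence is set to the assumed population value, which is why a population prevalence has to be supplied for case-control data.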

Value

List with components

derived.data: A data frame in long form showing the following for each predictor and each 'threshold': 'FPR': false positive rate, 'TPR': true positive rate, 'NB': net benefit, 'sNB': standardized net benefit, 'rho': outcome prevalence, 'prob.high.risk': percent of the population considered high risk, 'DP': detection probability = TPR*rho, 'model': name of prediction model or 'all' or 'none', and the cost.benefit.ratio values.
folds: number of folds used for cross-validation.
call: matched function call.
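The quantities listed for derived.data are related to one another. The sketch below builds a small toy data frame to show those relationships for an opt-in policy. The values are invented, the column names only mirror those described above, and prob.high.risk is expressed as a proportion here. Taking sNB as NB divided by rho is the usual definition of standardized net benefit and is assumed here rather than taken from the package source.

```r
# Toy values for one hypothetical model at two thresholds (not rmda output).
toy <- data.frame(
  thresholds = c(0.10, 0.20),
  TPR        = c(0.90, 0.80),
  FPR        = c(0.50, 0.30),
  rho        = c(0.10, 0.10)
)

# Detection probability: cases flagged as high risk, as a share of everyone.
toy$DP <- toy$TPR * toy$rho

# Share of the population flagged as high risk (cases plus controls).
toy$prob.high.risk <- toy$TPR * toy$rho + toy$FPR * (1 - toy$rho)

# Opt-in net benefit, then standardized by prevalence (assumed definition).
odds <- toy$thresholds / (1 - toy$thresholds)
toy$NB  <- toy$DP - odds * toy$FPR * (1 - toy$rho)
toy$sNB <- toy$NB / toy$rho

print(toy)
```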

Examples


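library(rmda)

# Example data set distributed with rmda
data(dcaData)

# Folds are assigned at random, so fix the seed for reproducible results
set.seed(123)

# Cross-validated decision curve (5 folds)
full.model_cv <- cv_decision_curve(Cancer ~ Age + Female + Smokes + Marker1 + Marker2,
                                   data = dcaData,
                                   folds = 5,
                                   thresholds = seq(0, .4, by = .01))

# Apparent (in-sample) decision curve for comparison, without confidence intervals
full.model_apparent <- decision_curve(Cancer ~ Age + Female + Smokes + Marker1 + Marker2,
                                      data = dcaData,
                                      thresholds = seq(0, .4, by = .01),
                                      confidence.intervals = 'none')

# Plot both curves on the same axes
plot_decision_curve(list(full.model_apparent, full.model_cv),
                    curve.names = c('Apparent curve', 'Cross-validated curve'),
                    col = c('red', 'blue'),
                    lty = c(2, 1),
                    lwd = c(3, 2, 2, 1),
                    legend.position = 'bottomright')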