valProbggplot {CalibrationCurves} | R Documentation |
Calibration performance: ggplot version
Description
The function valProbggplot is an adaptation of val.prob from Frank Harrell's rms package, https://cran.r-project.org/package=rms. Hence, the description of some of the functions of valProbggplot comes from the original val.prob.
The key feature of valProbggplot is the generation of logistic and flexible calibration curves and related statistics.
When using this code, please cite: Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina, M.J., Steyerberg,
E.W. (2016). A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology,
74, pp. 167-176
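The logistic calibration statistics mentioned above can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: it assumes the usual definitions in which the calibration slope is the coefficient of the predicted log odds in a logistic regression, and the calibration intercept is estimated with the slope fixed at 1 through an offset. The simulated data and all variable names are hypothetical.

```r
# Base R sketch of the calibration intercept and slope (illustrative only)
set.seed(1)
n  <- 2000
lp <- rnorm(n)                    # hypothetical true linear predictor
y  <- rbinom(n, 1, plogis(lp))    # outcomes generated from the true risks
p  <- plogis(0.5 + 2 * lp)        # deliberately miscalibrated predictions
logit_p <- qlogis(p)              # predicted log odds

# Calibration slope: coefficient of the predicted log odds
slope_fit <- glm(y ~ logit_p, family = binomial)
cal_slope <- unname(coef(slope_fit)["logit_p"])

# Calibration intercept: intercept with the slope fixed at 1 via an offset
int_fit <- glm(y ~ offset(logit_p), family = binomial)
cal_intercept <- unname(coef(int_fit)["(Intercept)"])

print(c(intercept = cal_intercept, slope = cal_slope))
```

Because the simulated predictions are too extreme and too high on average, the slope should come out below 1 and the intercept below 0.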
Usage
valProbggplot(
p,
y,
logit,
group,
weights = rep(1, length(y)),
normwt = FALSE,
pl = TRUE,
smooth = c("loess", "rcs", "none"),
CL.smooth = "fill",
CL.BT = FALSE,
lty.smooth = 1,
col.smooth = "black",
lwd.smooth = 1,
nr.knots = 5,
logistic.cal = FALSE,
lty.log = 1,
col.log = "black",
lwd.log = 1,
xlab = "Predicted probability",
ylab = "Observed proportion",
xlim = c(-0.02, 1),
ylim = c(-0.15, 1),
m,
g,
cuts,
emax.lim = c(0, 1),
legendloc = c(0.5, 0.27),
statloc = c(0, 0.85),
dostats = TRUE,
cl.level = 0.95,
method.ci = "pepe",
roundstats = 2,
riskdist = "predicted",
size = 3,
size.leg = 5,
connect.group = FALSE,
connect.smooth = TRUE,
g.group = 4,
evaluate = 100,
nmin = 0,
d0lab = "0",
d1lab = "1",
size.d01 = 5,
dist.label = 0.01,
line.bins = -0.05,
dist.label2 = 0.04,
cutoff,
length.seg = 0.85,
lty.ideal = 1,
col.ideal = "red",
lwd.ideal = 1,
allowPerfectPredictions = FALSE,
argzLoess = alist(degree = 2)
)
Arguments
p |
predicted probability |
y |
vector of binary outcomes |
logit |
predicted log odds of outcome. Specify either |
group |
a grouping variable. If numeric this variable is grouped into
|
weights |
an optional numeric vector of per-observation weights (usually frequencies),
used only if |
normwt |
set to |
pl |
|
smooth |
|
CL.smooth |
|
CL.BT |
|
lty.smooth |
the linetype of the flexible calibration curve. Default is |
col.smooth |
the color of the flexible calibration curve. Default is |
lwd.smooth |
the line width of the flexible calibration curve. Default is |
nr.knots |
specifies the number of knots for the rcs-based calibration curve. The default, which is also the highest allowed value, is 5. If the specified number of knots leads to estimation problems, the number of knots is automatically reduced to the closest value without estimation problems. |
logistic.cal |
|
lty.log |
if |
col.log |
if |
lwd.log |
if |
xlab |
x-axis label, default is |
ylab |
y-axis label, default is |
xlim , ylim |
numeric vectors of length 2, giving the x and y coordinates ranges (see |
m |
If grouped proportions are desired, average no. observations per group |
g |
If grouped proportions are desired, number of quantile groups |
cuts |
If grouped proportions are desired, actual cut points for constructing
intervals, e.g. |
emax.lim |
Vector containing lowest and highest predicted probability over which to
compute |
legendloc |
if |
statloc |
the "abc" of model performance (Steyerberg et al., 2011), i.e. the calibration intercept, calibration slope,
and c-statistic, will be added to the plot, using statloc as the upper left corner of a box (default is c(0, 0.85)).
You can specify a list or a vector. Use locator(1) for the mouse, |
dostats |
specifies whether and which performance measures are shown in the figure.
|
cl.level |
if |
method.ci |
method to calculate the confidence interval of the c-statistic. The argument is passed to |
roundstats |
specifies the number of decimals to which the statistics are rounded when shown in the plot. Default is 2. |
riskdist |
Use |
size , size.leg |
controls the font size of the statistics ( |
connect.group |
Defaults to |
connect.smooth |
Defaults to |
g.group |
number of quantile groups to use when |
evaluate |
number of points at which to store the |
nmin |
applies when |
d0lab , d1lab |
controls the labels for events and non-events (i.e. outcome y) for the histograms.
Defaults are |
size.d01 |
controls the size of the labels for events and non-events. Default is 5. |
dist.label |
controls the horizontal position of the labels for events and non-events. Default is 0.01. |
line.bins |
controls the vertical (y-axis) position of the histograms. Default is -0.05. |
dist.label2 |
controls the vertical distance between the labels for events and non-events. Default is 0.04. |
cutoff |
puts an arrow at the specified risk cut-off(s). Default is none. |
length.seg |
controls the length of the histogram lines. Default is |
lty.ideal |
linetype of the ideal line. Default is |
col.ideal |
controls the color of the ideal line on the plot. Default is |
lwd.ideal |
controls the line width of the ideal line on the plot. Default is |
allowPerfectPredictions |
Logical, indicates whether perfect predictions (i.e. values of either 0 or 1) are allowed. Default is |
argzLoess |
a list with arguments passed to the |
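The arguments m, g and cuts all request grouped proportions. As a rough illustration of the g case, the sketch below forms quantile groups of the predicted probabilities in base R and compares the mean prediction with the observed proportion in each group. It is an assumption-laden illustration, not the grouping code used inside valProbggplot; the data and variable names are hypothetical.

```r
# Base R sketch: observed proportions within quantile groups of predicted risk
set.seed(2)
p <- runif(1000, 0.05, 0.95)   # hypothetical predicted probabilities
y <- rbinom(1000, 1, p)        # outcomes simulated to be well calibrated

g <- 10                        # number of quantile groups
breaks <- quantile(p, probs = seq(0, 1, length.out = g + 1))
grp <- cut(p, breaks = breaks, include.lowest = TRUE)

grouped <- data.frame(
  mean_predicted = as.vector(tapply(p, grp, mean)),
  observed_prop  = as.vector(tapply(y, grp, mean)),
  n              = as.vector(table(grp))
)
print(grouped)
```

For a well-calibrated model, mean_predicted and observed_prop should be close in every row.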
Details
When using the predicted probabilities of an uninformative model (i.e. equal probabilities for all observations), the model has no predictive value. Consequently, where applicable, the value of the performance measure corresponds to the worst possible theoretical value. For the ECI, for example, this equals 1 (Edlinger et al., 2022).
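To make the uninformative case concrete for one measure, the sketch below computes a c-statistic by hand as the proportion of concordant event/non-event pairs, counting ties as one half. This helper (c_stat) is hypothetical and written for illustration only; it is not the estimator or confidence interval method used by the package, and it says nothing about how other measures such as the ECI are computed.

```r
# Base R sketch: c-statistic for constant versus perfectly separating predictions
c_stat <- function(p, y) {
  p1 <- p[y == 1]                 # predictions for events
  p0 <- p[y == 0]                 # predictions for non-events
  # Score each event/non-event pair: 1 if concordant, 0.5 if tied, 0 otherwise
  mean(outer(p1, p0, function(a, b) (a > b) + 0.5 * (a == b)))
}

set.seed(3)
y <- rbinom(200, 1, 0.3)

p_const <- rep(mean(y), length(y))     # uninformative: same risk for everyone
p_sep   <- ifelse(y == 1, 0.8, 0.2)    # hypothetical perfectly separating risks

c_const <- c_stat(p_const, y)
c_sep   <- c_stat(p_sep, y)
print(c(constant = c_const, separating = c_sep))
```

With constant predictions every pair is tied, so the c-statistic equals 0.5, the value that indicates no discrimination.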
Value
An object of type ggplotCalibrationCurve with the following slots:
call |
the matched call. |
ggPlot |
the ggplot object. |
stats |
a vector containing performance measures of calibration. |
cl.level |
the confidence level used. |
Calibration |
contains the calibration intercept and slope, together with their confidence intervals. |
Cindex |
the value of the c-statistic, together with its confidence interval. |
warningMessages |
if any, the warning messages that were printed while running the function. |
CalibrationCurves |
The coordinates for plotting the calibration curves. |
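The returned components can be inspected after the call. The sketch below assumes that the object behaves like a named list, so that the slots listed above are reachable with `$`; that access pattern is an assumption here rather than something this page states. The data are simulated, the model is a plain glm fit evaluated on its own training data purely for brevity, and the whole block is skipped if the package is not installed.

```r
# Sketch: inspecting the components of a ggplotCalibrationCurve object
if (requireNamespace("CalibrationCurves", quietly = TRUE)) {
  library(CalibrationCurves)

  set.seed(4)
  x <- rnorm(500)
  y <- rbinom(500, 1, plogis(0.2 + x))      # hypothetical outcome
  fit  <- glm(y ~ x, family = binomial)
  pred <- predict(fit, type = "response")   # predicted probabilities

  res <- valProbggplot(pred, y)

  print(res$stats)         # vector of performance measures
  print(res$Calibration)   # calibration intercept and slope with intervals
  print(res$Cindex)        # c-statistic with its confidence interval
  print(res$cl.level)      # confidence level used
}
```

Storing the result in a variable, as done here, keeps the plot (res$ggPlot) and the statistics available for later use.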
Note
To use the functions of the auRoc package, the user needs to install JAGS. However, our package only uses the auc.nonpara.mw function, which does not depend on JAGS. We therefore copied its code and slightly adjusted it for method="pepe".
References
Edlinger, M., van Smeden, M., Alber, H.F., Wanitschek, M., Van Calster, B. (2022). Risk prediction models for discrete ordinal outcomes: Calibration and the impact of the proportional odds assumption. Statistics in Medicine, 41(8), pp. 1334-1360
Qin, G., & Hotilovac, L. (2008). Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test. Statistical Methods in Medical Research, 17(2), pp. 207-21
Steyerberg, E.W., Van Calster, B., Pencina, M.J. (2011). Performance measures for prediction models and markers: evaluation of predictions and classifications. Revista Espanola de Cardiologia, 64(9), pp. 788-794
Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina, M.J., Steyerberg, E.W. (2016). A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology, 74, pp. 167-176
Van Hoorde, K., Van Huffel, S., Timmerman, D., Bourne, T., Van Calster, B. (2015). A spline-based tool to assess and visualize the calibration of multiclass risk predictions. Journal of Biomedical Informatics, 54, pp. 283-93
Examples
# Load package
library(CalibrationCurves)
set.seed(1783)
# Simulate training data
X = replicate(4, rnorm(5e2))
p0true = binomial()$linkinv(cbind(1, X) %*% c(0.1, 0.5, 1.2, -0.75, 0.8))
y = rbinom(5e2, 1, p0true)
Df = data.frame(y, X)
# Fit logistic model
FitLog = lrm(y ~ ., Df)
# Simulate validation data
Xval = replicate(4, rnorm(5e2))
p0true = binomial()$linkinv(cbind(1, Xval) %*% c(0.1, 0.5, 1.2, -0.75, 0.8))
yval = rbinom(5e2, 1, p0true)
Pred = binomial()$linkinv(cbind(1, Xval) %*% coef(FitLog))
# Default calibration plot
valProbggplot(Pred, yval)
# Adding logistic calibration curves and other additional features
valProbggplot(Pred, yval, CL.smooth = TRUE, logistic.cal = TRUE, lty.log = 2,
              col.log = "red", lwd.log = 1.5)
valProbggplot(Pred, yval, CL.smooth = TRUE, logistic.cal = TRUE, lty.log = 9,
              col.log = "red", lwd.log = 1.5, col.ideal = colors()[10], lwd.ideal = 0.5)
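For readers who want to see what a flexible (loess-based) calibration curve amounts to, the following base R sketch smooths the binary outcome against the predicted probability and draws the result next to the ideal diagonal. It only mirrors the idea behind smooth = "loess" with degree = 2; it is not the package's plotting code and it omits confidence limits. The simulated data, the 100-point grid and all variable names are hypothetical.

```r
# Base R sketch of a loess-based flexible calibration curve
set.seed(5)
p <- runif(1000, 0.05, 0.95)    # hypothetical predicted probabilities
y <- rbinom(1000, 1, p^1.5)     # true risk is lower than predicted

# Smooth the 0/1 outcome as a function of the predicted probability
fit  <- loess(y ~ p, degree = 2)
grid <- seq(min(p), max(p), length.out = 100)
curve_hat <- predict(fit, newdata = data.frame(p = grid))

plot(grid, curve_hat, type = "l", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Predicted probability", ylab = "Observed proportion")
abline(0, 1, col = "red")       # ideal line: observed equals predicted
```

Since the simulated predictions overestimate the risk, the smoothed curve should lie mostly below the diagonal.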