calplot {glmnetr}    R Documentation

Construct calibration plots for a nested.glmnetr output object

Description

Using k-fold cross validation, this function constructs calibration plots for a nested.glmnetr output object. The outcome in each hold-out subset of the k-fold cross validation is regressed, using splines, on the x*beta predictions from the model fit to the non-hold-out data. This yields k spline functions for evaluating model performance, which are averaged to provide an overall model calibration. Standard deviations of the k spline fits are also calculated as a function of the predicted x*beta, and these are used to derive and plot approximate 95% confidence intervals (mean +/- 2 * SD/sqrt(k)).

Because regression equations can be unreliable when extrapolating beyond the data range used in model derivation, the overall calibration fit and CIs are displayed with solid lines only over the region that lies within the range of the predicted x*betas of every one of the k hold-out sets. Predicted x*betas beyond the range of any one of the hold-out sets are displayed with dashed lines to reflect the lesser certainty when extrapolating, even for a single hold-out set.

The spline fits use the same framework as the original machine learning model fits, i.e. one of the "cox", "binomial" or "gaussian" families. For the "cox" framework the pspline() function is used, and for the "binomial" and "gaussian" frameworks the ns() function is used.
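
The averaging scheme described above can be illustrated with base R alone. The following is a minimal sketch for the "gaussian" case, not the package's implementation: the simulated data, the plain lm() model standing in for the machine learning fit, the fold assignment and all variable names are assumptions made for illustration.

```r
library(splines)  # provides ns(); part of the standard R distribution

set.seed(1)
n <- 500   # sample size (illustrative)
k <- 5     # number of folds (illustrative)
x <- matrix(rnorm(n * 3), n, 3)
dat <- data.frame(y = drop(x %*% c(1, -0.5, 0.25)) + rnorm(n),
                  x1 = x[, 1], x2 = x[, 2], x3 = x[, 3])
foldid <- sample(rep(seq_len(k), length.out = n))

# Step 1: fit on the non-hold-out data, predict x*beta for the hold-out fold.
# lm() is only a stand-in for whichever model was actually fit.
xb_cv <- numeric(n)
for (i in seq_len(k)) {
  fit_i <- lm(y ~ x1 + x2 + x3, data = dat[foldid != i, ])
  xb_cv[foldid == i] <- predict(fit_i, newdata = dat[foldid == i, ])
}

# Step 2: within each hold-out fold, regress the outcome on x*beta with a
# natural spline and evaluate every fold's curve on one common grid.
grid <- seq(min(xb_cv), max(xb_cv), length.out = 101)
curves <- sapply(seq_len(k), function(i) {
  d_i <- data.frame(y = dat$y[foldid == i], xb = xb_cv[foldid == i])
  cal_i <- lm(y ~ ns(xb, df = 3), data = d_i)
  predict(cal_i, newdata = data.frame(xb = grid))
})

# Step 3: average the k curves; approximate 95% CI as mean +/- 2 * SD / sqrt(k).
cal_mean <- rowMeans(curves)
cal_sd   <- apply(curves, 1, sd)
lower <- cal_mean - 2 * cal_sd / sqrt(k)
upper <- cal_mean + 2 * cal_sd / sqrt(k)

# Step 4: solid lines only where every fold has data, dashed lines elsewhere.
lo_all <- max(tapply(xb_cv, foldid, min))
hi_all <- min(tapply(xb_cv, foldid, max))
inside <- grid >= lo_all & grid <= hi_all

plot(grid, cal_mean, type = "n", xlab = "predicted x*beta",
     ylab = "calibrated outcome", ylim = range(lower, upper))
abline(0, 1, col = "grey")                      # line of perfect calibration
lines(grid, cal_mean, lty = 2)                  # dashed over the full range
lines(grid[inside], cal_mean[inside], lty = 1)  # solid over the common range
lines(grid, lower, lty = 2, col = 2)
lines(grid, upper, lty = 2, col = 2)
lines(grid[inside], lower[inside], col = 2)
lines(grid[inside], upper[inside], col = 2)
```

A curve close to the grey identity line suggests good calibration. With only k curves, the interval is a rough guide rather than an exact 95% interval.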

Usage

calplot(
  object,
  wbeta = NULL,
  df = 3,
  resample = NULL,
  usefold = 1,
  plot = 1,
  plotfold = 0,
  plothr = 0,
  knottype = 1,
  trim = 0,
  vref = 0,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  ylab = NULL,
  col.term = 1,
  col.se = 2,
  rug = 1,
  seed = NULL,
  cv = NULL,
  fold = NULL,
  ...
)

Arguments

object

A nested.glmnetr() output object for calibration

wbeta

An integer specifying which Beta should be plotted. This depends on which machine learning models were run when creating the output object. If unsure, the user can run the function without specifying wbeta and a legend will be directed to the console.

df

The degrees of freedom for the spline function

resample

1 to base the splines on the leave out X*Beta's ($xbetas.cv or $xbetas.boot), or 0 to use the naive X*Beta's ($xbetas). This can be done to see biases associated with the naive approach.

usefold

1 (default) to base the calibration on splines fit to each individual fold and then averaged, 0 to base the calibration on a single spline fit using all X*Beta.

plot

1 by default to produce plots, 0 to output data for plots only, 2 to plot and output data.

plotfold

0 by default to not plot the individual fold calibrations, 1 to overlay the k leave out spline calibration fits in a single figure and 2 to produce separate plots for each of the k hold out calibration curves.

plothr

a power > 1 determining the spacing of the values on the axes, e.g. 2, exp(1), sqrt(10) or 10. The default of 0 plots the X*Beta. This only applies for "cox" survival data models.

knottype

1 (default) to use OOB (out of bag) test validation data when choosing ns() knots for the gaussian and binomial families, 2 to use the larger training data.

trim

the percent of the top and bottom of the data to be trimmed away when producing plots. The original data are still used when calculating the curves for plotting.

vref

Similar to trim, but instead of trimming the spline lines, plots vertical reference lines at the top vref and bottom vref percent of the model X*Betas.

xlim

xlim for the plots. This does not affect the curves within the plotted region. Caution: for the "cox" framework the xlim are specified in terms of the X*beta and not the HR, even when HR is described on the axes.

ylim

ylim for the plots, which will usually only be specified in a second run for the same data. This does not affect the curves within the plotted region. Caution: for the "cox" framework the ylim are specified in terms of the X*beta and not the HR, even when HR is described on the axes.

xlab

a user specified label for the x axis

ylab

a user specified label for the y axis

col.term

a number specifying the color of the line depicting the overall calibration estimates

col.se

a number specifying the color of the +/- 2 * standard error lines for the overall calibration estimates

rug

1 (default, as shown in Usage) to plot a rug for the model x*betas, 0 to not.

seed

an integer seed used to randomly select which of the multiple X*Betas is used in the rug when bootstrapping is used for model evaluation, as sample elements may be included multiple times as test (out of bag) data.

cv

Deprecated. Use resample option instead.

fold

Deprecated. Use usefold instead. fold is too easily confused with a single fold for plotting.

...

allowance to pass terms to the invoked plot function
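
Several of the arguments above (plothr, xlim, ylim, trim, vref) are expressed on the X*beta scale or as percentages of the data. The sketch below shows, in base R only, how such values might be derived by hand. It is an illustration under stated assumptions and not code from the package: the simulated `xbeta` vector, the tick placement at integer powers of the base, and the use of quantile() for the percent cut-offs are all assumed here, since the help page does not spell out the exact rules.

```r
# Hypothetical x*beta values standing in for predictions from a "cox" model.
set.seed(2)
xbeta <- rnorm(200, sd = 0.8)

# plothr: label the axis in hazard ratios (HR) spaced at integer powers of a
# base such as 2 (assumed reading of "spacing of the values on the axes").
base    <- 2
powers  <- seq(floor(min(xbeta) / log(base)), ceiling(max(xbeta) / log(base)))
tick_at  <- powers * log(base)  # tick positions on the x*beta scale
tick_lab <- base^powers         # matching HR labels, e.g. 0.25, 0.5, 1, 2, 4

# xlim / ylim: limits are given as x*beta, so HR limits are converted by log().
hr_range   <- c(0.25, 4)
xlim_xbeta <- log(hr_range)

# trim / vref: cut-offs at the bottom and top `pct` percent of the x*betas
# (quantile() with its default type is an assumption).
pct  <- 5
cuts <- unname(quantile(xbeta, probs = c(pct, 100 - pct) / 100))

# Illustrative display: HR labels on an x*beta axis, with reference lines.
plot(xbeta, xbeta, xaxt = "n", xlim = xlim_xbeta,
     xlab = "hazard ratio", ylab = "x*beta")
axis(1, at = tick_at, labels = tick_lab)
abline(v = cuts, lty = 3)
```

In this sketch an HR range of 0.25 to 4 corresponds to xlim = log(c(0.25, 4)), which is the form of limit the "cox" caution for xlim and ylim asks for.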

Details

Optionally, for comparison, the program can fit a spline based upon the predicted x*betas ignoring the cross validation structure, or one can fit a spline using the x*betas calculated using the model based upon all data.

Value

Calibration plots are returned by default, and optionally data for plots are output to a list.
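
A typical sequence of calls might look like the sketch below. It assumes an object named `fit` that was previously created by nested.glmnetr(); that object, its name, and the index `wbeta = 5` are placeholders and are not taken from this page. The calls are guarded so that the code does nothing when the package or the object is unavailable.

```r
# `fit` is assumed to be an existing nested.glmnetr() output object;
# it is not created here.
can_run <- requireNamespace("glmnetr", quietly = TRUE) && exists("fit")

if (can_run) {
  library(glmnetr)

  # Without wbeta, a legend of the available betas is directed to the console.
  calplot(fit)

  # Plot the cross validated calibration for one model (index is a placeholder),
  # overlaying the individual fold curves.
  calplot(fit, wbeta = 5, df = 3, resample = 1, plotfold = 1)

  # For comparison, the naive calibration based on the all-data x*betas.
  calplot(fit, wbeta = 5, df = 3, resample = 0)

  # plot = 0 returns the plotting data without drawing anything.
  cal <- calplot(fit, wbeta = 5, plot = 0)
  str(cal, max.level = 1)
}
```

The structure of the returned list is not documented on this page, so str() is used to inspect it rather than assuming element names.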

Author(s)

Walter Kremers (kremers.walter@mayo.edu)

See Also

plot.nested.glmnetr, summary.nested.glmnetr, nested.glmnetr


[Package glmnetr version 0.5-2 Index]