cal_validate_logistic {probably}		R Documentation
Measure performance with and without using logistic calibration
Description
This function uses resampling to measure the effect of calibrating predicted values.
Usage
cal_validate_logistic(
.data,
truth = NULL,
estimate = dplyr::starts_with(".pred_"),
metrics = NULL,
save_pred = FALSE,
...
)
## S3 method for class 'resample_results'
cal_validate_logistic(
.data,
truth = NULL,
estimate = dplyr::starts_with(".pred_"),
metrics = NULL,
save_pred = FALSE,
...
)
## S3 method for class 'rset'
cal_validate_logistic(
.data,
truth = NULL,
estimate = dplyr::starts_with(".pred_"),
metrics = NULL,
save_pred = FALSE,
...
)
## S3 method for class 'tune_results'
cal_validate_logistic(
.data,
truth = NULL,
estimate = NULL,
metrics = NULL,
save_pred = FALSE,
...
)
Arguments
.data
An rset object or the results of tune::fit_resamples() where a .predictions column exists.

truth
The column identifier for the true class results (that is, a factor). This should be an unquoted column name.

estimate
A vector of column identifiers, or one of the dplyr selector functions, to choose the variables that contain the class probabilities. The default selects columns whose names start with ".pred_".

metrics
A set of metrics created via yardstick::metric_set().

save_pred
Indicates whether to save a column of post-calibration predictions.

...
Options to pass to cal_estimate_logistic().
Details
These functions are designed to calculate performance with and without calibration. They use resampling to measure out-of-sample effectiveness. There are two ways to pass the data in:

- If you have a data frame of predictions, an rset object can be created via rsample functions. See the example below.

- If you have already made a resampling object from the original data and used it with tune::fit_resamples(), you can pass that object to the calibration function and it will use the same resampling scheme. If a different resampling scheme should be used, run tune::collect_predictions() on the object and use the process in the previous bullet point.
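The second route above can be sketched as follows. This is an illustrative example, not taken from the package documentation: two_class_dat (from the modeldata package) and the plain logistic_reg() specification are stand-ins for your own data and model; the key point is that predictions must be saved during resampling.

```r
library(tidymodels)
library(probably)

# Resample the original data; two_class_dat is an assumed example dataset
set.seed(1)
folds <- vfold_cv(modeldata::two_class_dat)

# Predictions must be retained so the calibration can be validated
res <- fit_resamples(
  logistic_reg(),
  Class ~ A + B,
  resamples = folds,
  control = control_resamples(save_pred = TRUE)
)

# Reuses the same resampling scheme held in `res`
cal_validate_logistic(res)
```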
Please note that these functions do not apply to tune_results objects. The notion of "validation" implies that the tuning parameter selection has been resolved. collect_metrics() can be used to aggregate the metrics for analysis.
Value
The original object with a .metrics_cal column and, optionally, an additional .predictions_cal column. The class cal_rset is also added.
Performance Metrics
By default, the average of the Brier scores is returned. Any appropriate yardstick::metric_set() can be used. The validation function compares the average of the metrics before and after calibration.
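A custom metric set might be supplied as sketched below. This is an assumption-laden example: brier_class() and roc_auc() are yardstick metrics assumed to be available in your installed version, and segment_logistic is the example prediction data shipped with probably.

```r
library(probably)
library(rsample)
library(yardstick)

# Assumed metrics: brier_class() and roc_auc() from yardstick
cls_metrics <- metric_set(brier_class, roc_auc)

set.seed(1)
segment_logistic |>
  vfold_cv() |>
  cal_validate_logistic(Class, metrics = cls_metrics)
```

Both metrics are then reported twice per resample: once for the uncalibrated predictions and once for the calibrated ones.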
See Also
https://www.tidymodels.org/learn/models/calibration/,
cal_estimate_logistic()
Examples
library(dplyr)
# ---------------------------------------------------------------------------
# classification example
segment_logistic %>%
  rsample::vfold_cv() %>%
  cal_validate_logistic(Class)
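The validated object can then be summarized. The sketch below assumes probably provides collect_metrics() and collect_predictions() methods for cal_rset objects; if your installed version differs, inspect the .metrics_cal column directly.

```r
library(dplyr)
library(probably)

set.seed(1)
res <- segment_logistic %>%
  rsample::vfold_cv() %>%
  cal_validate_logistic(Class, save_pred = TRUE)

# Average metrics for uncalibrated vs. calibrated predictions
collect_metrics(res)

# Post-calibration predictions (available because save_pred = TRUE)
collect_predictions(res)
```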