calibrate {CalibratR}    R Documentation

calibrate

Description

Builds the selected calibration models on the supplied training values actual and predicted and returns them to the user. New test instances can be calibrated using the predict_calibratR function. Returns cross-validated calibration and discrimination error values for the models if evaluate_CV_error is set to TRUE. Repeated cross-validation can be time-consuming.

Usage

calibrate(actual, predicted, model_idx = c(1, 2, 3, 4, 5),
  evaluate_no_CV_error = TRUE, evaluate_CV_error = TRUE, folds = 10,
  n_seeds = 30, nCores = 4)

Arguments

actual

vector of observed class labels (0/1)

predicted

vector of uncalibrated predictions

model_idx

which calibration models should be implemented, 1=hist_scaled, 2=hist_transformed, 3=BBQ_scaled, 4=BBQ_transformed, 5=GUESS, Default: c(1, 2, 3, 4, 5)

evaluate_no_CV_error

computes internal errors for calibration models that were trained on all available actual/predicted tuples. Testing is performed on the same set. Interpret these error values with care, as they are not cross-validated. Default: TRUE

evaluate_CV_error

computes the cross-validation error. folds-fold cross-validation is repeated n_seeds times with changing seeds. The trained models and their calibration and discrimination errors are returned. Evaluating CV errors can take some time, depending on the number of repetitions specified in n_seeds. Default: TRUE

folds

number of folds in the cross-validation of the calibration model. If folds is set to 1, no CV is performed and summary_CV can be calculated. Default: 10

n_seeds

determines how often the random data set partitioning is repeated, each time with a different seed. If folds is 1, n_seeds should be set to 1, too. Default: 30

nCores

number of cores to use during parallelisation. Default: 4
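The "calibration error" mentioned under evaluate_no_CV_error and evaluate_CV_error can be pictured with a small base-R sketch. It computes the expected calibration error (ECE), one widely used calibration measure. This page does not state which measures calibrate reports, so the helper ece and its n_bins argument are hypothetical and purely illustrative.

```r
# Hypothetical helper (not part of CalibratR): expected calibration error
# with equal-width bins over [0, 1].
ece <- function(actual, predicted, n_bins = 10) {
  bins <- cut(predicted,
              breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  conf <- tapply(predicted, bins, mean)   # mean prediction per bin
  acc  <- tapply(actual, bins, mean)      # observed event rate per bin
  n    <- tapply(actual, bins, length)    # instances per bin (NA if empty)
  keep <- !is.na(n)
  # Weighted average of |observed rate - mean prediction| over non-empty bins
  sum(n[keep] / length(actual) * abs(acc[keep] - conf[keep]))
}

# Toy data: predictions of 0.1 for negatives and 0.9 for positives
toy_actual    <- c(0, 0, 1, 1)
toy_predicted <- c(0.1, 0.1, 0.9, 0.9)
toy_ece <- ece(toy_actual, toy_predicted)
```

Computing such a value on the same tuples used for training tends to give an optimistic figure, which is why the non-cross-validated errors call for cautious interpretation.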

Details

Execution of the random data set splits for the cross-validation procedure is parallelised over n_seeds.
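The idea of repeating the random partitioning once per seed can be sketched in base R. This is a minimal illustration, not the package's internal code: make_folds is a hypothetical helper, and the way calibrate actually derives its seeds and folds is not documented here.

```r
# Hypothetical helper (not part of CalibratR): assign n instances to
# `folds` random folds of near-equal size, reproducibly for a given seed.
make_folds <- function(n, folds, seed) {
  set.seed(seed)
  # Shuffle balanced fold labels so each fold gets about n / folds instances
  sample(rep(seq_len(folds), length.out = n))
}

n       <- 100
folds   <- 10
n_seeds <- 3

# One independent partition per seed. Because the runs do not depend on
# each other, they could be distributed over several cores, for example
# by replacing lapply with parallel::parLapply on a cluster.
partitions <- lapply(seq_len(n_seeds),
                     function(seed) make_folds(n, folds, seed))

# Every instance belongs to exactly one fold in each partition
fold_sizes <- table(partitions[[1]])
```

Each element of partitions would then drive one complete folds-fold cross-validation run, so the total work grows roughly with folds times n_seeds.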

Value

A list object with the following components:

calibration_models

a list of all trained calibration models, which can be used in the predict_calibratR method.

summary_CV

a list containing information on the CV errors of the implemented models

summary_no_CV

a list containing information on the internal errors of the implemented models

predictions

calibrated predictions for the original predicted values

n_seeds

number of random data set partitions into training and test sets for the folds-fold CV

Author(s)

Johanna Schwarz

Examples

 ## Loading dataset in environment
 data(example)
 actual <- example$actual
 predicted <- example$predicted

 ## Create calibration models
 calibration_model <- calibrate(actual, predicted,
                               model_idx = c(1,2),
                               evaluate_no_CV_error = FALSE,
                               evaluate_CV_error = FALSE,
                               folds = 10, n_seeds = 1, nCores = 2)

[Package CalibratR version 0.1.2 Index]