calibrate {CalibratR}

R Documentation

calibrate

Description

Builds the selected calibration models on the supplied training values `actual` and `predicted` and returns them to the user. New test instances can be calibrated using the `predict_calibratR` function. Returns cross-validated calibration and discrimination error values for the models if `evaluate_CV_error` is set to TRUE. Repeated cross-validation can be time-consuming.

Usage

calibrate(actual, predicted, model_idx = c(1, 2, 3, 4, 5),
  evaluate_no_CV_error = TRUE, evaluate_CV_error = TRUE, folds = 10,
  n_seeds = 30, nCores = 4)

Arguments

`actual`	vector of observed class labels (0/1)
`predicted`	vector of uncalibrated predictions
`model_idx`	which calibration models should be implemented, 1=hist_scaled, 2=hist_transformed, 3=BBQ_scaled, 4=BBQ_transformed, 5=GUESS, Default: c(1, 2, 3, 4, 5)
`evaluate_no_CV_error`	computes internal errors for calibration models that were trained on all available `actual`/`predicted` tuples. Testing is performed on the same set, so interpret these error values with care: they are not cross-validated. Default: TRUE
`evaluate_CV_error`	computes the cross-validation error. `folds`-fold cross-validation is repeated `n_seeds` times with changing seeds. The trained models and their calibration and discrimination errors are returned. Evaluating CV errors can take some time, depending on the number of repetitions specified in `n_seeds`. Default: TRUE
`folds`	number of folds in the cross-validation of the calibration model. If `folds` is set to 1, no CV is performed and `summary_CV` can be calculated. Default: 10
`n_seeds`	number of times the random data set partition is repeated, each time with a different seed. If `folds` is 1, `n_seeds` should be set to 1, too. Default: 30
`nCores`	number of cores to use during parallelisation. Default: 4

Details

Parallelised execution of the random data set splits for the cross-validation procedure over `n_seeds`.
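
To illustrate how `folds` and `n_seeds` interact, the following base R sketch builds repeated random fold assignments, one per seed. It is not the package's internal code: the helper `make_repeated_folds` is hypothetical, and using the seeds 1 to `n_seeds` is an assumption made only for this illustration.

```r
# Hypothetical helper (not part of CalibratR): assign each of n observations
# to one of `folds` groups, reshuffling with a different seed per repetition.
# Note: set.seed() changes the global RNG state, which is fine for a sketch.
make_repeated_folds <- function(n, folds = 10, n_seeds = 30) {
  lapply(seq_len(n_seeds), function(seed) {
    set.seed(seed)                                # assumed: seeds 1..n_seeds
    sample(rep(seq_len(folds), length.out = n))   # shuffled fold labels
  })
}

partitions <- make_repeated_folds(n = 25, folds = 5, n_seeds = 3)

# Each repetition is independent of the others, so this loop is the part
# that could be handed to parallel workers (e.g. via parallel::mclapply).
test_sizes <- lapply(partitions, function(fold_id) {
  vapply(seq_len(5), function(k) sum(fold_id == k), integer(1))
})
```

Because the repetitions do not depend on each other, the total work grows roughly with `folds` times `n_seeds`, which is why reducing `n_seeds` is the most direct way to shorten the run time.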

Value

A list object with the following components:

`calibration_models`	a list of all trained calibration models, which can be used in the `predict_calibratR` method.
`summary_CV`	a list containing information on the CV errors of the implemented models
`summary_no_CV`	a list containing information on the internal errors of the implemented models
`predictions`	calibrated predictions for the original `predicted` values
`n_seeds`	number of random data set partitions into training and test set for `folds`-times CV

Author(s)

Johanna Schwarz

Examples

 ## Loading dataset in environment
 data(example)
 actual <- example$actual
 predicted <- example$predicted

 ## Create calibration models
 calibration_model <- calibrate(actual, predicted,
                               model_idx = c(1, 2),
                               evaluate_no_CV_error = FALSE,
                               evaluate_CV_error = FALSE,
                               folds = 10, n_seeds = 1, nCores = 2)
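
The trained models can then be applied to new instances with `predict_calibratR`, as mentioned in the Description. The sketch below is hedged: the argument names `new` and `nCores` are assumed here and should be checked against `?predict_calibratR`, and the first ten training predictions merely stand in for unseen data.

```r
# Sketch only: runs when the CalibratR package is installed.
if (requireNamespace("CalibratR", quietly = TRUE)) {
  library(CalibratR)
  data(example)

  # Train two calibration models without any error evaluation.
  models <- calibrate(example$actual, example$predicted,
                      model_idx = c(1, 2),
                      evaluate_no_CV_error = FALSE,
                      evaluate_CV_error = FALSE,
                      folds = 10, n_seeds = 1, nCores = 2)

  # Stand-in for new, uncalibrated predictions (illustration only).
  new_values <- example$predicted[1:10]

  # Assumed signature: predict_calibratR(calibration_models, new, nCores).
  calibrated <- predict_calibratR(models$calibration_models,
                                  new = new_values, nCores = 2)
  str(calibrated)  # inspect the structure of the returned object
}
```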

[Package CalibratR version 0.1.2 Index]