calibrate {CORElearn} | R Documentation |
Calibration of probabilities according to the given prior.
Description
Given probability scores predictedProb, as provided for example by a call to predict.CoreModel,
and using one of the available methods given by method,
the function calibrates the predicted probabilities so that they
match the actual probabilities of the binary class 1 provided by correctClass.
The computed calibration can be applied to the scores returned by that model.
Usage
calibrate(correctClass, predictedProb, class1=1,
method = c("isoReg","binIsoReg","binning","mdlMerge"),
weight=NULL, noBins=10, assumeProbabilities=FALSE)
applyCalibration(predictedProb, calibration)
Arguments
correctClass |
A vector of correct class labels for a binary classification problem. |
predictedProb |
A vector of predicted class 1 (probability) scores. In |
class1 |
A class value (factor) or an index of the class value to be taken as a class to be calibrated. |
method |
One of |
weight |
If specified, should be of the same length as |
noBins |
The value of parameter depends on the parameter |
assumeProbabilities |
If |
calibration |
The list resulting from a call to |
Details
Depending on the specified method, one of the following calibration methods is executed.
- "isoReg": isotonic regression calibration based on the pool-adjacent violators (PAV) algorithm.
- "binning": calibration into a pre-specified number of bands given by the noBins parameter, trying to make bins of equal weight.
- "binIsoReg": first the binning method is executed, followed by an isotonic regression calibration.
- "mdlMerge": first intervals are merged by an MDL gain criterion into a prespecified number of intervals, followed by the isotonic regression calibration.
If method="binning", the parameter noBins specifies the desired number of bins, i.e., calibration bands;
if method="binIsoReg", the parameter noBins specifies the number of initial bins that are formed by binning before isotonic regression is applied;
if method="mdlMerge", the parameter noBins specifies the number of bins formed after first applying isotonic regression. The most similar bins are merged using the MDL criterion.
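To illustrate the idea behind isotonic regression calibration, the following minimal sketch uses isoreg from base R (package stats) on a small made-up data set. It is only an illustration of the general technique and is not the implementation used inside calibrate; the vectors score and label are hypothetical.

```r
# Toy data (hypothetical values): classifier scores, already sorted,
# and the true 0/1 labels of the corresponding instances
score <- c(0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.95)
label <- c(0,    0,    1,    0,    1,    0,    1,    1)

# isoreg() from base R fits a non-decreasing step function to the labels;
# neighbouring values that violate monotonicity are pooled and averaged
fit <- isoreg(score, label)

# fitted values in the order of increasing score; these play the role
# of calibrated probabilities in this sketch
calibrated <- fit$yf
print(data.frame(score = score, label = label, calibrated = calibrated))
```

Each run of equal fitted values corresponds to one interval of scores sharing the same calibrated probability.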
Value
The function calibrate returns a list with two vector components of the same length:
interval |
The boundaries of the intervals. Lower boundary 0 is not explicitly included but should be taken into account. |
calProb |
The calibrated probabilities for each corresponding interval. |
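As a rough sketch of how such a list could be read, the code below maps scores to calibrated probabilities with findInterval from base R. It assumes that interval holds the upper boundaries of right-closed intervals in increasing order; that convention, the helper name lookupCalibrated, and the example values are assumptions made for illustration. In practice applyCalibration should be used.

```r
# Hypothetical calibration list with the two components described above
calibration <- list(interval = c(0.3, 0.7, 1.0),
                    calProb  = c(0.1, 0.5, 0.9))

# Hypothetical helper: look up the calibrated probability for each score.
# Assumes intervals (0, 0.3], (0.3, 0.7], (0.7, 1.0], with the implicit
# lower boundary 0 belonging to the first interval.
lookupCalibrated <- function(score, calibration) {
  # number of boundaries strictly below the score, plus one
  idx <- findInterval(score, calibration$interval, left.open = TRUE) + 1
  # guard against scores above the last boundary
  idx <- pmin(idx, length(calibration$calProb))
  calibration$calProb[idx]
}

result <- lookupCalibrated(c(0, 0.3, 0.31, 0.7, 1.0), calibration)
print(result)
```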
Author(s)
Marko Robnik-Sikonja
References
I. Kononenko, M. Kukar: Machine Learning and Data Mining: Introduction to Principles and Algorithms. Horwood, 2007
A. Niculescu-Mizil, R. Caruana: Predicting Good Probabilities With Supervised Learning. Proceedings of the 22nd International Conference on Machine Learning (ICML'05), 2005
See Also
reliabilityPlot, CORElearn, predict.CoreModel.
Examples
# generate separate data sets for training the model,
# calibrating the probabilities, and testing
train <- classDataGen(noInst=200)
cal <- classDataGen(noInst=200)
test <- classDataGen(noInst=200)
# build a random forest model with default parameters
modelRF <- CoreModel(class~., train, model="rf", maxThreads=1)
# prediction
predCal <- predict(modelRF, cal, rfPredictClass=FALSE)
predTest <- predict(modelRF, test, rfPredictClass=FALSE)
destroyModels(modelRF) # clean up, model not needed anymore
# calibrate for a chosen class1 and method
class1 <- 1
calibration <- calibrate(cal$class, predCal$prob[,class1], class1=class1,
                         method="isoReg", assumeProbabilities=TRUE)
# apply the calibration to the testing set
calibratedProbs <- applyCalibration(predTest$prob[,class1], calibration)
# the calibration of probabilities can be visualized with
# reliabilityPlot function
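
A reliability check can also be sketched by hand in base R, without CORElearn: scores are grouped into bins and the mean predicted score of each bin is compared with the observed frequency of class 1. The data below are simulated and the variable names are hypothetical; this is not a substitute for reliabilityPlot.

```r
set.seed(1)
# Simulated scores and labels (hypothetical data); labels are drawn with
# probability equal to the score, so the scores are well calibrated
prob  <- runif(500)
truth <- rbinom(500, size = 1, prob = prob)

# split the scores into 5 equal-width bins
bins <- cut(prob, breaks = seq(0, 1, by = 0.2), include.lowest = TRUE)

# per bin: mean predicted score, observed class 1 frequency, bin size
reliability <- data.frame(
  bin           = levels(bins),
  meanPredicted = as.vector(tapply(prob, bins, mean)),
  observedFreq  = as.vector(tapply(truth, bins, mean)),
  n             = as.vector(table(bins))
)
print(reliability)
```

For well-calibrated scores, meanPredicted and observedFreq should be close to each other in every bin.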