plattCalibration {MiMIR} | R Documentation
plattCalibration
Description
Calibrates predicted probabilities with Platt calibration and evaluates the result.
Usage
plattCalibration(r.calib, p.calib, nbins = 10, pl = FALSE)
Arguments
r.calib | observed binary phenotype
p.calib | predicted probabilities
nbins | number of bins used to build the plots
pl | logical indicating whether the function should plot the reliability diagram and the histogram of the calibrations
Details
Many popular machine learning algorithms produce inaccurate predicted probabilities, especially when applied to a dataset different from the training set. Platt (1999) proposed an adjustment in which the original probabilities are used as the predictor in a single-variable logistic regression to produce more accurate, adjusted predicted probabilities. The function also helps evaluate the calibration by plotting reliability diagrams and the distributions of the calibrated and non-calibrated probabilities. A reliability diagram plots the mean predicted value within each bin of predicted probabilities against the observed fraction of positive cases in that bin. Finally, the function reports accuracy measures for the calibration: the expected calibration error (ECE), the maximum calibration error (MCE) and the Log-Loss of the probabilities before and after calibration.
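The steps described above can be sketched in a few lines of base R. This is a minimal illustration on simulated data, not the MiMIR or eRic source: the helper names `log_loss` and `calib_error`, the simulated variables, and the equal-width binning used for ECE and MCE are all assumptions made for the example, and the package's own definitions may differ in detail.

```r
# Illustrative sketch only; every name below is hypothetical, not part of MiMIR.
set.seed(1)

# Simulated data: a binary outcome y and overconfident raw scores p_raw
n <- 2000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))   # the true probability is plogis(x)
p_raw <- plogis(3 * x)         # raw scores are pushed too far towards 0 and 1

# Platt calibration: single-variable logistic regression of y on the raw score
fit <- glm(y ~ p_raw, family = binomial)
p_cal <- as.numeric(predict(fit, type = "response"))

# Log-Loss, with clipping to avoid log(0)
log_loss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# ECE and MCE over equal-width bins (one common definition, assumed here)
calib_error <- function(y, p, nbins = 10) {
  bin <- cut(p, breaks = seq(0, 1, length.out = nbins + 1),
             include.lowest = TRUE)
  conf <- as.numeric(tapply(p, bin, mean))   # mean predicted probability per bin
  obs <- as.numeric(tapply(y, bin, mean))    # observed fraction of positives per bin
  w <- as.numeric(table(bin)) / length(p)    # share of samples in each bin
  gap <- abs(obs - conf)
  keep <- !is.na(gap)                        # drop empty bins
  c(ECE = sum(w[keep] * gap[keep]), MCE = max(gap[keep]))
}

# Compare the measures before and after calibration
before <- c(calib_error(y, p_raw), LogLoss = log_loss(y, p_raw))
after <- c(calib_error(y, p_cal), LogLoss = log_loss(y, p_cal))
print(rbind(before, after))
```

Within `calib_error`, the per-bin pairs (`conf`, `obs`) are the coordinates a reliability diagram would display; a well-calibrated model has these points close to the diagonal.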
Value
A list with samples, responses, calibrations, ECE, MCE and, if pl = TRUE, the calibration plots.
References
This function was originally created for the eRic package under the name prCalibrate and was modified ad hoc for our purposes (GitHub).
J. C. Platt, 'Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods', in Advances in Large Margin Classifiers, 1999, pp. 61-74.
Examples
library(MiMIR)
library(stats)
library(plotly)

# Load the synthetic datasets shipped with the package
met <- synthetic_metabolic_dataset
phen <- synthetic_phenotypic_dataset

# Binarize the phenotypes
b_phen <- binarize_all_pheno(phen)

# Compute the surrogate scores
surr <- calculate_surrogate_scores(met, phen, MiMIR::PARAM_surrogates,
                                   bin_names = colnames(b_phen))

# Calibrate the sex surrogate against the observed binary phenotype
real_data <- as.numeric(b_phen$sex)
pred_data <- surr$surrogates[, "s_sex"]
plattCalibration(r.calib = real_data, p.calib = pred_data, nbins = 10, pl = TRUE)