calibration_simplex {CalSim}  R Documentation 
Generates an object of class calibration_simplex, which can be used to assess the calibration of ternary probability forecasts. The calibration simplex can be seen as a generalization of the reliability diagram for binary probability forecasts. For details on the interpretation of the calibration simplex, see Wilks (2013). Note that some minor changes have been made relative to the calibration simplex as proposed by Wilks (2013) (see the note below).
As a somewhat experimental feature, multinomial p-values can be used for uncertainty quantification, that is, as a tool to judge whether the observed discrepancies may be merely coincidental or whether the predictions may in fact be miscalibrated; see Resin (2020, Section 4.2).
calibration_simplex(n, p1, p2, p3, obs, test_stat, percentagewise)

## Default S3 method:
calibration_simplex(
  n = 10,
  p1 = NULL,
  p2 = NULL,
  p3 = NULL,
  obs = NULL,
  test_stat = "LLR",
  percentagewise = FALSE
)
n 
A natural number (default 10) that controls the number of bins into which the forecasts are grouped.
p1 
A vector containing the forecast probabilities for the first (1) category, e.g. below-normal.
p2
A vector containing the forecast probabilities for the second (2) category, e.g. near-normal.
p3
A vector containing the forecast probabilities for the third (3) category, e.g. above-normal.
obs
A vector containing the observed outcomes. Categories are encoded as 1 (e.g. below-normal), 2 (e.g. near-normal) and 3 (e.g. above-normal).
test_stat
A string indicating which test statistic is to be used for the multinomial test in each bin. Options are "LLR" (log-likelihood ratio; default), "Chisq" (Pearson's chi-square) and "Prob" (probability mass statistic). See Details.
percentagewise 
Logical, specifying whether probabilities are percentagewise (summing to 100) or not (summing to 1). 
Only two of the three forecast probability vectors (p1, p2 and p3) need to be specified.
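Two vectors suffice because the three probabilities sum to 1 (or to 100 when percentagewise = TRUE), so the third is determined by the other two. The following base R sketch illustrates that relationship. The helper complete_forecast() is hypothetical and not part of CalSim; how the package fills in the omitted vector internally is an assumption here.

```r
# Hypothetical helper (NOT part of CalSim): derive the omitted probability
# vector from the two that were supplied.
complete_forecast <- function(p1 = NULL, p2 = NULL, p3 = NULL,
                              percentagewise = FALSE) {
  total <- if (percentagewise) 100 else 1
  given <- list(p1 = p1, p2 = p2, p3 = p3)
  missing_name <- names(given)[vapply(given, is.null, logical(1))]
  if (length(missing_name) > 1) {
    stop("Specify at least two of p1, p2 and p3.")
  }
  if (length(missing_name) == 1) {
    # The missing probabilities are whatever remains of the total.
    others <- given[setdiff(names(given), missing_name)]
    given[[missing_name]] <- total - others[[1]] - others[[2]]
  }
  do.call(cbind, given)  # one row per forecast, columns p1, p2, p3
}

p1 <- c(0.2, 0.5, 0.1)
p3 <- c(0.3, 0.25, 0.6)
probs <- complete_forecast(p1 = p1, p3 = p3)
```

Each row of probs is then a complete ternary forecast, which makes it easy to check that the inputs are on the expected scale before passing them on.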
The p-values are based on multinomial tests comparing the observed frequencies within a bin with the average forecast probabilities within the bin, as outlined in Resin (2020, Section 4.2). The p-values are exact and do not rely on asymptotics; however, it is assumed that the true distribution (under the hypothesis of forecast calibration) within each bin is approximated well by the multinomial distribution. If n is small, the approximation may be poor, resulting in unreliable p-values. P-values less than 0.0001 are not exact but merely indicate a value less than 0.0001.
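To make the idea of an exact multinomial test concrete, the sketch below computes a p-value for a single bin by brute-force enumeration of all possible outcome counts. It is an illustration only: it is not the algorithm of Resin (2020), which is designed to be far more efficient, and the function name exact_multinom_pval() and the precise statistic definitions are assumptions rather than the package's internals.

```r
# Brute-force exact multinomial test for three categories (illustration only;
# NOT the CalSim implementation). x: observed counts, p: hypothesized
# probabilities (all > 0), stat: which test statistic to use.
exact_multinom_pval <- function(x, p, stat = c("LLR", "Chisq", "Prob")) {
  stat <- match.arg(stat)
  N <- sum(x)
  # Larger statistic values indicate stronger disagreement with p.
  statistic <- function(k) {
    switch(stat,
      LLR   = 2 * sum(ifelse(k > 0, k * log(k / (N * p)), 0)),
      Chisq = sum((k - N * p)^2 / (N * p)),
      Prob  = -dmultinom(k, size = N, prob = p, log = TRUE)
    )
  }
  t_obs <- statistic(x)
  pval <- 0
  # Sum the probabilities of all outcomes at least as extreme as x.
  for (k1 in 0:N) {
    for (k2 in 0:(N - k1)) {
      k <- c(k1, k2, N - k1 - k2)
      if (statistic(k) >= t_obs - 1e-9) {  # small tolerance for ties
        pval <- pval + dmultinom(k, size = N, prob = p)
      }
    }
  }
  pval
}

# Six observations in a bin whose average forecast is uniform.
p_extreme  <- exact_multinom_pval(c(6, 0, 0), rep(1 / 3, 3), stat = "LLR")
p_balanced <- exact_multinom_pval(c(2, 2, 2), rep(1 / 3, 3), stat = "LLR")
```

Enumeration costs grow quadratically with the bin count in the ternary case, so this approach is only practical for small bins; it is meant to show what "exact" means, not how to compute it at scale.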
A list with class "calibration_simplex" containing

As input by user or default. 

Computed from 

Total number of observations. 

Vector of length 

Matrix containing the observed outcome frequencies within each bin. 

Matrix containing the average forecast probabilities within each bin. 

Exact multinomial p-values within each bin. See Details.
Object of class calibration_simplex.
In contrast to the calibration simplex proposed by Wilks (2013), the simplex has been mirrored at the diagonal through the bottom-left hexagon. By default, the miscalibration error is calculated precisely (in each bin, as the difference between the relative frequencies of each class and the average forecast probabilities) instead of approximately (using Wilks's original formula). Approximate errors can be used by setting true_error = FALSE when using plot.calibration_simplex.
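The precise per-bin error described in this note can be sketched in base R for a single bin. The function bin_error() is a hypothetical helper, not a CalSim function, and the sign convention (relative frequency minus average forecast probability) is an assumption.

```r
# Hypothetical helper (NOT part of CalSim): miscalibration error in one bin.
# p:   matrix of forecasts in the bin, one row per forecast, columns p1, p2, p3
# obs: observed categories (1, 2 or 3) for the same forecasts
bin_error <- function(p, obs) {
  rel_freq <- tabulate(obs, nbins = 3) / length(obs)  # observed class shares
  avg_prob <- colMeans(p)                             # average forecast
  rel_freq - avg_prob  # assumed sign: positive means the class was underforecast
}

p <- cbind(p1 = c(0.20, 0.30, 0.25, 0.25),
           p2 = c(0.50, 0.40, 0.45, 0.45),
           p3 = c(0.30, 0.30, 0.30, 0.30))
obs <- c(1, 2, 2, 3)
err <- bin_error(p, obs)
```

Because both the relative frequencies and the average forecast probabilities sum to 1, the three error components always sum to zero.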
Wilks, D. S. (2013), The Calibration Simplex: A Generalization of the Reliability Diagram for Three-Category Probability Forecasts, Weather and Forecasting, 28, 1210-1218
Resin, J. (2020), A Simple Algorithm for Exact Multinomial Tests, Preprint https://arxiv.org/abs/2008.12682
attach(ternary_forecast_example)  # see also documentation of sample data
# ?ternary_forecast_example

# Calibrated forecast sample
calsim0 = calibration_simplex(p1 = p1, p3 = p3, obs = obs0)
plot(calsim0, use_pvals = TRUE)  # with multinomial p-values

# Overconfident forecast sample
calsim1 = calibration_simplex(p1 = p1, p3 = p3, obs = obs1)
plot(calsim1)

# Underconfident forecast sample
calsim2 = calibration_simplex(p1 = p1, p3 = p3, obs = obs2)
plot(calsim2, use_pvals = TRUE)  # with multinomial p-values

# Unconditionally biased forecast sample
calsim3 = calibration_simplex(p1 = p1, p3 = p3, obs = obs3)
plot(calsim3)

# Using a different number of bins
calsim = calibration_simplex(n = 4, p1 = p1, p3 = p3, obs = obs3)
plot(calsim)

calsim = calibration_simplex(n = 13, p1 = p1, p3 = p3, obs = obs3)
plot(calsim,
     # using some additional plotting parameters:
     error_scale = 0.5,   # errors are less pronounced (smaller shifts)
     min_bin_freq = 100,  # dots are plotted only for bins
                          # which contain at least 100 forecast-outcome pairs
     category_labels = c("below-normal", "near-normal", "above-normal"),
     main = "Sample calibration simplex")

detach(ternary_forecast_example)