summary.reliabilitydiag {reliabilitydiag}    R Documentation

Decomposing scores into miscalibration, discrimination and uncertainty

Description

An object of class reliabilitydiag contains the observations, the original forecasts, and recalibrated forecasts given by isotonic regression. The function summary.reliabilitydiag calculates quantitative measures of predictive performance, miscalibration, discrimination, and uncertainty for each prediction method in relation to its recalibrated version.

Usage

## S3 method for class 'reliabilitydiag'
summary(object, ..., score = "brier")

Arguments

object

an object inheriting from the class 'reliabilitydiag'.

...

further arguments to be passed to or from methods.

score

either "brier" (currently the only named option) or a vectorized scoring function of the form function(observation, prediction).
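
For instance, the Brier score written in this vectorized form reads as follows (an illustrative sketch, equivalent to the default score = "brier"):

## takes the full observation and prediction vectors, returns one score per pair
brier_fun <- function(observation, prediction) (prediction - observation)^2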

Details

Predictive performance is measured by the mean score of the original forecast values, denoted by S.

Uncertainty, denoted by UNC, is the mean score of a constant prediction at the value of the average observation. It is the highest possible mean score of a calibrated prediction method.

Discrimination, denoted by DSC, is UNC minus the mean score of the PAV-recalibrated forecast values. A small value indicates a low information content (low signal) in the original forecast values.

Miscalibration, denoted by MCB, is S minus the mean score of the PAV-recalibrated forecast values. A high value indicates that predictive performance of the prediction method can be improved by recalibration.

These measures are related by the following equation:

S = MCB - DSC + UNC.

Score decompositions of this type have been studied extensively, but the optimality of the PAV solution ensures that MCB is nonnegative regardless of the chosen (admissible) scoring function, a property unique to PAV-recalibration.
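
The decomposition can be reproduced by hand under the Brier score, using stats::isoreg for the PAV recalibration step; this is an illustrative sketch with simulated data, not the package's internal code:

set.seed(1)
x <- runif(100)                          # original forecast values
y <- rbinom(100, 1, x^2)                 # binary observations; forecasts are miscalibrated

ord <- order(x)
x_rc <- numeric(length(x))
x_rc[ord] <- isoreg(x[ord], y[ord])$yf   # PAV-recalibrated forecast values

brier <- function(y, p) mean((p - y)^2)
S <- brier(y, x)                         # mean score of the original forecasts
UNC <- brier(y, mean(y))                 # constant prediction at the average observation
MCB <- S - brier(y, x_rc)                # nonnegative by PAV optimality
DSC <- UNC - brier(y, x_rc)              # information gained by recalibration
all.equal(S, MCB - DSC + UNC)            # the decomposition identity holds exactly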

When deviating from the Brier score as the performance metric, make sure to choose a proper scoring rule for binary events or, equivalently, a scoring function with outcome space {0, 1} that is consistent for the expectation functional.
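
For instance, the logarithmic score is such a proper scoring rule and can be supplied in the required vectorized form; this is an illustrative helper, not part of the package, and it yields infinite scores whenever a (recalibrated) forecast is exactly 0 or 1:

## logarithmic score for observations y in {0, 1} and predictions x in (0, 1)
log_score <- function(y, x) -y * log(x) - (1 - y) * log(1 - x)
summary(r, score = log_score)            # with r as constructed in the Examples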

Value

A 'summary.reliabilitydiag' object, which is also a tibble (see tibble::tibble()) with columns:

forecast

the name of the prediction method.

mean_score

the mean score of the original forecast values.

miscalibration

a measure of miscalibration (how reliable is the prediction method?); smaller is better.

discrimination

a measure of discrimination (how variable are the recalibrated predictions?); larger is better.

uncertainty

the mean score of a constant prediction at the value of the average observation.
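
Since the result is a tibble, these columns can be used directly, e.g. to rank the prediction methods by miscalibration (an illustrative sketch using the object r constructed in the Examples):

s <- summary(r)
## smaller miscalibration indicates better calibration, so sort ascending
s[order(s$miscalibration), c("forecast", "miscalibration")]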

Examples

data("precip_Niamey_2016", package = "reliabilitydiag")
r <- reliabilitydiag(
  precip_Niamey_2016[c("Logistic", "EMOS", "ENS", "EPC")],
  y = precip_Niamey_2016$obs,
  region.level = NA
)
summary(r)
summary(r, score = function(y, x) (x - y)^2)
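
## The decomposition identity S = MCB - DSC + UNC from the Details
## section can be checked directly on the returned columns:
s <- summary(r)
all.equal(s$mean_score, s$miscalibration - s$discrimination + s$uncertainty)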

