bayes_ms {BRcal}  R Documentation

Bayesian Model Selection-Based Calibration Assessment

Description

Performs a Bayesian model selection-based approach to determine whether a set of predicted probabilities x is well calibrated given the corresponding set of binary event outcomes y, as described in Guthrie and Franck (2024).

Usage

bayes_ms(
  x,
  y,
  Pmc = 0.5,
  event = 1,
  optim_details = TRUE,
  epsilon = .Machine$double.eps,
  ...
)

Arguments

x

a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].

y

a vector of outcomes corresponding to probabilities in x. Must only contain two unique values (one for "events" and one for "non-events"). By default, this function expects a vector of 0s (non-events) and 1s (events).

Pmc

The prior model probability for the calibrated model M_c.

event

Value in y that represents an "event". Default value is 1.

optim_details

Logical. If TRUE, the list returned by optim when minimizing the negative log likelihood is also returned by this function.

epsilon

Amount by which probabilities are pushed away from the 0 or 1 boundary for numerical stability. If a value in x is less than epsilon, it is replaced with epsilon. If a value in x is greater than 1 - epsilon, it is replaced with 1 - epsilon.

...

Additional arguments to be passed to optim.
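As a hedged illustration of the epsilon adjustment described above (this is assumed behavior, not the package's internal code), the adjustment amounts to clamping x into [epsilon, 1 - epsilon]:

```r
# Illustrative sketch (not BRcal's internal implementation):
# clamp probabilities into [epsilon, 1 - epsilon] for numerical stability
clamp_probs <- function(x, epsilon = .Machine$double.eps) {
  pmin(pmax(x, epsilon), 1 - epsilon)
}
clamp_probs(c(0, 0.5, 1), epsilon = 1e-6)  # 1e-06 0.5 0.999999
```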

Details

This function compares a well-calibrated model, M_c, where \delta = \gamma = 1, to an uncalibrated model, M_u, where \delta > 0 and \gamma \in \mathbb{R}.

The posterior model probability of M_c given the observed outcomes y (returned as posterior_model_prob) is expressed as

P(M_c|\mathbf{y}) = \frac{P(\mathbf{y}|M_c) P(M_c)}{P(\mathbf{y}|M_c) P(M_c) + P(\mathbf{y}|M_{u}) P(M_{u})}

where P(\mathbf{y}|M_i) is the integrated likelihood of y given M_i and P(M_i) is the prior probability of model i, i \in \{c,u\}. By default, this function uses P(M_c) = P(M_u) = 0.5. To set a different prior for P(M_c), use Pmc; P(M_u) is then set to 1 - Pmc.
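The posterior model probability formula above translates directly into R. This is an illustrative helper (not a BRcal function) that takes the integrated likelihoods as given:

```r
# Illustrative sketch (not part of BRcal): posterior probability of the
# calibrated model M_c given integrated likelihoods P(y|M_c) and P(y|M_u)
post_prob_mc <- function(ml_c, ml_u, Pmc = 0.5) {
  Pmu <- 1 - Pmc
  (ml_c * Pmc) / (ml_c * Pmc + ml_u * Pmu)
}
post_prob_mc(ml_c = 0.02, ml_u = 0.01)  # likelihoods favor M_c 2:1 -> 2/3
```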

The Bayes factor (returned as BF) compares M_u to M_c. This value is approximated via the following large-sample Bayesian Information Criterion (BIC) approximation (see Kass and Raftery 1995; Kass and Wasserman 1995):

BF = \frac{P(\mathbf{y}|M_{u})}{P(\mathbf{y}|M_c)} \approx \exp\left\{ -\frac{1}{2}(BIC_u - BIC_c) \right\}
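This approximation reduces to a one-line helper; the sketch below is illustrative, not the package's code:

```r
# Illustrative sketch: BIC approximation to the Bayes factor of M_u over M_c
bf_bic <- function(BIC_u, BIC_c) exp(-0.5 * (BIC_u - BIC_c))
bf_bic(BIC_u = 100, BIC_c = 104)  # lower BIC_u -> BF = exp(2), evidence for M_u
```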

where the BIC for the calibrated model (returned as BIC_Mc) is

BIC_c = - 2 \times log(\pi(\delta = 1, \gamma =1|\mathbf{x},\mathbf{y}))

and the BIC for the uncalibrated model (returned as BIC_Mu) is

BIC_u = 2\times log(n) - 2\times log(\pi(\hat\delta_{MLE}, \hat\gamma_{MLE}|\mathbf{x},\mathbf{y})).
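Putting the two BIC formulas together: the sketch below assumes the LLO (linear log odds) recalibration likelihood used in Guthrie and Franck (2024), p(x; \delta, \gamma) = \delta x^\gamma / (\delta x^\gamma + (1-x)^\gamma). The function names are illustrative, not BRcal internals, and the unconstrained optim call is a simplification (in general \delta > 0 would need to be enforced, e.g. by optimizing over log \delta):

```r
# Illustrative sketch (hypothetical helper names, not BRcal internals).
# Assumes the LLO family: p = delta * x^gamma / (delta * x^gamma + (1 - x)^gamma)
llo <- function(x, delta, gamma) {
  delta * x^gamma / (delta * x^gamma + (1 - x)^gamma)
}
loglik <- function(par, x, y) {
  p <- llo(x, par[1], par[2])
  sum(y * log(p) + (1 - y) * log(1 - p))
}

set.seed(1)
x <- runif(100)
y <- rbinom(100, 1, x)

# Calibrated model M_c: delta = gamma = 1 fixed, so no parameter penalty
BIC_c <- -2 * loglik(c(1, 1), x, y)

# Uncalibrated model M_u: 2 free parameters estimated by maximum likelihood;
# unconstrained Nelder-Mead is a simplification (delta > 0 is not enforced)
fit <- optim(par = c(1, 1), fn = function(par) -loglik(par, x, y))
BIC_u <- 2 * log(length(x)) + 2 * fit$value  # fit$value = minimized neg. log lik.

exp(-0.5 * (BIC_u - BIC_c))  # approximate Bayes factor of M_u over M_c
```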

Value

A list with the following attributes:

Pmc

The prior model probability for the calibrated model M_c.

BIC_Mc

The Bayesian Information Criterion (BIC) for the calibrated model M_c.

BIC_Mu

The Bayesian Information Criterion (BIC) for the uncalibrated model M_u.

BF

The Bayes factor of the uncalibrated model over the calibrated model.

posterior_model_prob

The posterior model probability of the calibrated model M_c given the observed outcomes y, i.e. P(M_c|y).

MLEs

Maximum likelihood estimates for \delta and \gamma.

optim_details

If optim_details = TRUE, the list returned by optim when minimizing the negative log likelihood, which includes convergence information, the number of iterations, the achieved negative log likelihood value, and the MLEs.

References

Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions, The American Statistician 1-17.

Kass, R. E., and Raftery, A. E. (1995) Bayes factors. Journal of the American Statistical Association.

Kass, R. E., and Wasserman, L. (1995) A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association.

Examples

# Simulate 100 predicted probabilities
x <- runif(100)
# Simulate 100 binary event outcomes using x
y <- rbinom(100, 1, x)  # By construction, x is well calibrated.

# Use the Bayesian model selection approach to check calibration of x given outcomes y
bayes_ms(x, y, optim_details=FALSE)

# To specify different prior model probability of calibration, use Pmc
# Prior model prob of 0.7:
bayes_ms(x, y, Pmc=0.7)
# Prior model prob of 0.2:
bayes_ms(x, y, Pmc=0.2)

# Use optim_details = TRUE to see returned info from call to optim(),
# details useful for checking convergence
bayes_ms(x, y, optim_details=TRUE)  # no convergence problems in this example

# Pass additional arguments to optim() via ... (see optim() for details)
# Specify different start values via par in optim() call, start at delta = 5, gamma = 5:
bayes_ms(x, y, optim_details=TRUE, par=c(5,5))
# Specify different optimization algorithm via method, L-BFGS-B instead of Nelder-Mead:
bayes_ms(x, y, optim_details=TRUE, method = "L-BFGS-B")  # same result

# What if events are defined by text instead of 0 or 1?
y2 <- ifelse(y==0, "Loss", "Win")
bayes_ms(x, y2, event="Win", optim_details=FALSE)  # same result

# What if we're interested in the probability of loss instead of win?
x2 <- 1 - x
bayes_ms(x2, y2, event="Loss", optim_details=FALSE)

# Push probabilities away from bounds by 0.000001
x3 <- c(runif(50, 0, 0.0001), runif(50, .9999, 1))
y3 <- rbinom(100, 1, 0.5)
bayes_ms(x3, y3, epsilon=0.000001)


[Package BRcal version 0.0.4 Index]