bayes_ms {BRcal} | R Documentation
Bayesian Model Selection-Based Calibration Assessment
Description
Performs a Bayesian model selection-based approach to determine whether a set of
predicted probabilities x
is well calibrated given the corresponding set of
binary event outcomes y,
as described in Guthrie and Franck (2024).
Usage
bayes_ms(
  x,
  y,
  Pmc = 0.5,
  event = 1,
  optim_details = TRUE,
  epsilon = .Machine$double.eps,
  ...
)
Arguments
x: a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].

y: a vector of outcomes corresponding to the probabilities in x.

Pmc: the prior model probability for the calibrated model M_c. Defaults to 0.5.

event: value in y that represents an "event". Defaults to 1.

optim_details: logical. If TRUE, the list returned by the call to optim() when fitting the uncalibrated model is included in the output; useful for checking convergence. Defaults to TRUE.

epsilon: amount by which probabilities are pushed away from the 0 or 1 boundary for numerical stability. Values in x within epsilon of 0 or 1 are replaced by epsilon or 1 - epsilon, respectively. Defaults to .Machine$double.eps.

...: additional arguments to be passed to optim().
Details
This function compares a well calibrated model, M_c, where \delta = \gamma = 1,
to an uncalibrated model, M_u, where \delta > 0 and \gamma \in \mathbb{R}.
The posterior model probability of M_c given the observed outcomes y
(returned as posterior_model_prob) is expressed as

P(M_c|\mathbf{y}) = \frac{P(\mathbf{y}|M_c) P(M_c)}{P(\mathbf{y}|M_c) P(M_c) + P(\mathbf{y}|M_u) P(M_u)}

where P(\mathbf{y}|M_i) is the integrated likelihood of y given M_i
and P(M_i) is the prior probability of model i, i \in \{c, u\}.
By default, this function uses P(M_c) = P(M_u) = 0.5. To set a
different prior for P(M_c), use Pmc, and P(M_u) will be set to
1 - Pmc.
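Since the integrated likelihoods enter only through their ratio, dividing the numerator and denominator above by P(\mathbf{y}|M_c) expresses this posterior directly in terms of the Bayes factor BF = P(\mathbf{y}|M_u) / P(\mathbf{y}|M_c) and the prior Pmc:

P(M_c|\mathbf{y}) = \frac{P(M_c)}{P(M_c) + BF \cdot P(M_u)} = \frac{Pmc}{Pmc + BF\,(1 - Pmc)}.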
The Bayes factor (returned as BF) compares M_u to M_c. This
value is approximated via the following large-sample Bayesian Information
Criterion (BIC) approximation (see Kass & Raftery 1995, Kass & Wasserman 1995):

BF = \frac{P(\mathbf{y}|M_u)}{P(\mathbf{y}|M_c)} \approx \exp\left\{ -\frac{1}{2}(BIC_u - BIC_c) \right\}

where the BIC for the calibrated model (returned as BIC_Mc) is

BIC_c = -2 \times \log(\pi(\delta = 1, \gamma = 1 | \mathbf{x}, \mathbf{y}))

and the BIC for the uncalibrated model (returned as BIC_Mu) is

BIC_u = 2 \times \log(n) - 2 \times \log(\pi(\hat\delta_{MLE}, \hat\gamma_{MLE} | \mathbf{x}, \mathbf{y})).
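As a numeric illustration (the BIC values here are assumed for the sake of the example, not taken from a real fit): suppose BIC_u - BIC_c = 2 with the default prior P(M_c) = P(M_u) = 0.5. Then

BF \approx \exp\{-\frac{1}{2}(2)\} = e^{-1} \approx 0.368

and the posterior model probability of the calibrated model is

P(M_c|\mathbf{y}) = \frac{0.5}{0.5 + 0.368 \times 0.5} \approx 0.731,

so the data modestly favor the calibrated model.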
Value
A list with the following attributes:
Pmc: the prior model probability for the calibrated model M_c.

BIC_Mc: the Bayesian Information Criterion (BIC) for the calibrated model M_c.

BIC_Mu: the Bayesian Information Criterion (BIC) for the uncalibrated model M_u.

BF: the Bayes factor of the uncalibrated model over the calibrated model.

posterior_model_prob: the posterior model probability of the calibrated model, P(M_c|y).

MLEs: maximum likelihood estimates for \delta and \gamma.

optim_details: if optim_details = TRUE, the list returned by the call to optim(); useful for checking convergence.
References
Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions. The American Statistician, 1-17.
Kass, R. E., and Raftery, A. E. (1995) Bayes Factors. Journal of the American Statistical Association.
Kass, R. E., and Wasserman, L. (1995) A Reference Bayesian Test for Nested Hypotheses and Its Relationship to the Schwarz Criterion. Journal of the American Statistical Association.
Examples
# Simulate 100 predicted probabilities
x <- runif(100)
# Simulate 100 binary event outcomes using x
y <- rbinom(100, 1, x) # By construction, x is well calibrated.
# Use the Bayesian model selection approach to check calibration of x given outcomes y
bayes_ms(x, y, optim_details=FALSE)
# To specify different prior model probability of calibration, use Pmc
# Prior model prob of 0.7:
bayes_ms(x, y, Pmc=0.7)
# Prior model prob of 0.2
bayes_ms(x, y, Pmc=0.2)
# Use optim_details = TRUE to see returned info from call to optim(),
# details useful for checking convergence
bayes_ms(x, y, optim_details=TRUE) # no convergence problems in this example
# Pass additional arguments to optim() via ... (see optim() for details)
# Specify different start values via par in optim() call, start at delta = 5, gamma = 5:
bayes_ms(x, y, optim_details=TRUE, par=c(5,5))
# Specify different optimization algorithm via method, L-BFGS-B instead of Nelder-Mead:
bayes_ms(x, y, optim_details=TRUE, method = "L-BFGS-B") # same result
# What if events are defined by text instead of 0 or 1?
y2 <- ifelse(y==0, "Loss", "Win")
bayes_ms(x, y2, event="Win", optim_details=FALSE) # same result
# What if we're interested in the probability of loss instead of win?
x2 <- 1 - x
bayes_ms(x2, y2, event="Loss", optim_details=FALSE)
# Push probabilities away from bounds by 0.000001
x3 <- c(runif(50, 0, 0.0001), runif(50, .9999, 1))
y3 <- rbinom(100, 1, 0.5)
bayes_ms(x3, y3, epsilon=0.000001)