R: Likelihood Ratio Test for Calibration

llo_lrt {BRcal}

R Documentation

Likelihood Ratio Test for Calibration

Description

Perform a likelihood ratio test for if calibration a set of probability predictions, x, are well-calibrated given a corresponding set of binary event outcomes, y. See Guthrie and Franck (2024).

Usage

llo_lrt(
  x,
  y,
  event = 1,
  optim_details = TRUE,
  epsilon = .Machine$double.eps,
  ...
)

Arguments

`x`	a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].
`y`	a vector of outcomes corresponding to probabilities in `x`. Must only contain two unique values (one for "events" and one for "non-events"). By default, this function expects a vector of 0s (non-events) and 1s (events).
`event`	Value in `y` that represents an "event". Default value is 1.
`optim_details`	Logical. If `TRUE`, the list returned by optim when minimizing the negative log likelihood is also returned by this function.
`epsilon`	Amount by which probabilities are pushed away from 0 or 1 boundary for numerical stability. If a value in `x` < `epsilon`, it will be replaced with `epsilon`. If a value in `x` > `1-epsilon`, that value will be replaced with `1-epsilon`.
`...`	Additional arguments to be passed to optim.

Details

This likelihood ratio test is based on the following likelihood

\pi(\mathbf{x}, \mathbf{y} | \delta, \gamma) = \prod_{i=1}^n c(x_i;\delta, \gamma)^{y_i} \left[1-c(x_i;\delta, \gamma)\right]^{1-y_i}

where c(x_i; \delta, \gamma) is the Linear in Log Odds (LLO) function, \delta>0 is the shift parameter on the logs odds scale, and \gamma \in \mathbb{R} is the scale parameter on the log odds scale.

As \delta = \gamma = 1 corresponds to no shift or scaling of probabilities, i.e. x is well calibrated given corresponding outcomes y. Thus the hypotheses for this test are as follows:

H_0: \delta = 1, \gamma = 1 \text{ (Probabilities are well calibrated)}

H_1: \delta \neq 1 \text{ and/or } \gamma \neq 1 \text{ (Probabilities are poorly calibrated)}.

The likelihood ratio test statistics for H_0 is

\lambda_{LR} = -2 log\left[\frac{\pi(\delta =1, \gamma=1|\mathbf{x}, \mathbf{y})}{\pi(\delta = \hat\delta_{MLE}, \gamma = \hat\gamma_{MLE}| \mathbf{x},\mathbf{y})}\right]

where \lambda_{LR}\stackrel{H_0}{\sim}{\chi^2_2} asymptotically under the null hypothesis H_0, and \hat{\delta}_{MLE} and \hat{\gamma}_{MLE} are the maximum likelihood estimates for \delta and \gamma.

Value

A list with the following attributes:

`test_stat`	The test statistic `\lambda_{LR}` from the likelihood ratio test.
`pval`	The p-value from the likelihood ratio test.
`MLEs`	Maximum likelihood estimates for `\delta` and `\gamma`.
`optim_details`	If `optim_details = TRUE`, the list returned by optim when minimizing the negative log likelihood, includes convergence information, number of iterations, and achieved negative log likelihood value and MLEs.

References

Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions, The American Statistician 1-17.

Examples

# Simulate 100 predicted probabilities
x <- runif(100)
# Simulated 100 binary event outcomes using `x`
y <- rbinom(100, 1, x)  # By construction, `x` is well calibrated.

# Run the likelihood ratio test on `x` and `y`
llo_lrt(x, y, optim_details=FALSE)

# Use optim_details = TRUE to see returned info from call to optim(),
# details useful for checking convergence
llo_lrt(x, y, optim_details=TRUE)  # no convergence problems in this example

# Use different start value in `optim()` call, start at delta = 5, gamma = 5
llo_lrt(x, y, optim_details=TRUE, par=c(5,5))

# Use `L-BFGS-B` instead of `Nelder-Mead`
llo_lrt(x, y, optim_details=TRUE, method = "L-BFGS-B")  # same result

# What if events are defined by text instead of 0 or 1?
y2 <- ifelse(y==0, "Loss", "Win")
llo_lrt(x, y2, event="Win", optim_details=FALSE)  # same result

# What if we're interested in the probability of loss instead of win?
x2 <- 1 - x
llo_lrt(x2, y2, event="Loss", optim_details=FALSE)

# Push probabilities away from bounds by 0.000001
x3 <- c(runif(50, 0, 0.0001), runif(50, .9999, 1))
y3 <- rbinom(100, 1, 0.5)
llo_lrt(x3, y3, epsilon=0.000001)

[Package BRcal version 0.0.4 Index]