R: Boldness-Recalibration for Binary Events

brcal {BRcal}

R Documentation

Boldness-Recalibration for Binary Events

Description

Perform Bayesian boldness-recalibration as specified in Guthrie and Franck (2024). Boldness-recalibration maximizes the spread in predictions (x) subject to a constraint on the minimum tolerable posterior probability of calibration (t).

Usage

brcal(
  x,
  y,
  t = 0.95,
  Pmc = 0.5,
  tau = FALSE,
  event = 1,
  start_at_MLEs = TRUE,
  x0 = NULL,
  lb = c(1e-05, -Inf),
  ub = c(Inf, Inf),
  maxeval = 500,
  maxtime = NULL,
  xtol_rel_inner = 1e-06,
  xtol_rel_outer = 1e-06,
  print_level = 3,
  epsilon = .Machine$double.eps,
  opts = NULL,
  optim_options = NULL
)

Arguments

`x`	a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].
`y`	a vector of outcomes corresponding to probabilities in `x`. Must only contain two unique values (one for "events" and one for "non-events"). By default, this function expects a vector of 0s (non-events) and 1s (events).
`t`	Minimum tolerable level of calibration in [0,1].
`Pmc`	The prior model probability for the calibrated model `M_c`.
`tau`	Logical. If `TRUE`, the optimization operates on `\tau = log(\delta)` instead of `\delta`. See details.
`event`	Value in `y` that represents an "event". Default value is 1.
`start_at_MLEs`	Logical. If `TRUE`, the optimizer will start at `x_0` = the maximum likelihood estimates for `\delta` and `\gamma`. Otherwise, the user must specify `x_0`.
`x0`	Vector with starting locations for `\delta` and `\gamma`. This argument is ignored when start_at_MLEs = TRUE.
`lb`	Vector with lower bounds for `\delta` and `\gamma`. Use `-Inf` to indicate no lower bound.
`ub`	Vector with upper bounds for `\delta` and `\gamma`. Use `Inf` to indicate no upper bound.
`maxeval`	Value passed to `nloptr()` to stop optimization when the number of function evaluations exceeds `maxeval`.
`maxtime`	Value passed to `nloptr()` to stop optimization when evaluation time (in seconds) exceeds `maxtime`.
`xtol_rel_inner`	Value passed to `nloptr()` to stop the inner optimization routine when the parameter estimates for `\delta` and `\gamma` change by less than `xtol_rel_inner`.
`xtol_rel_outer`	Value passed to `nloptr()` to stop the outer optimization routine when the parameter estimates for `\delta` and `\gamma` change by less than `xtol_rel_inner`.
`print_level`	Value passed to `nloptr()` to control how much output is printed during optimization. Default is to print the most information allowable by `nloptr()`. Specify `0` to suppress all output.
`epsilon`	Amount by which probabilities are pushed away from 0 or 1 boundary for numerical stability. If a value in `x` < `epsilon`, it will be replaced with `epsilon`. If a value in `x` > `1-epsilon`, that value will be replaced with `1-epsilon`.
`opts`	List with options to be passed to `nloptr`.
`optim_options`	List with options to be passed to `optim`.

Details

The objective function in boldness-recalibration is

f(\delta, \gamma) = -sd(\mathbf{x}')

and the constraint is

g(\delta, \gamma) = -(P(M_c|\mathbf{y}, \mathbf{x}')-t) \leq 0.

As both the objective and constraint functions are non-linear with respect to \delta and \gamma, we use nloptr for this optimization rather than optim. Note that we use x to denote a vector of predicted probabilities, nloptr() uses x to denote the parameters being optimized. Thus, starting values for \delta and \gamma are passed via argument x0 and all output refers to the objective and constraint as f(x) and g(x).

By default, this function uses the Augmented Lagrangian Algorithm (AUGLAG) (Conn et. al. 1991, Birgin and Martinez 2008) as the outer optimization routine and Sequential Least-Squares Quadratic Programming (SLSQP) (Dieter 1988, Dieter 1994) as the inner optimization routine.

Value

A list with the following attributes:

`nloptr`	The list returned by `nloptr()` including convergence information, number of iterations, and more.
`Pmc`	The prior model probability for the calibrated model `M_c` specified in function call.
`t`	Desired level of calibration in [0,1] specified in function call.
`BR_params`	(100`*`t)% Boldness-recalibration estimates for `\delta` and `\gamma`.
`sb`	The Bayesian Information Criteria (BIC) for the calibrated model `M_c`.
`probs`	Vector of (100`*`t)% boldness-recalibrated probabilities.

Adjusting call to `nloptr()`

For more control over the optimization routine conducted by nloptr(), the user may specify their own options via the opts argument. Note that any objective, constraint, or gradient functions specified by the user will be overwritten by those specified in this package. See the documentation for nloptr() and the NLopt website for full details (https://nlopt.readthedocs.io/en/latest/).

Adjusting call to `optim()`

While optim() is not used for the non-linear constrained optimization for finding he boldness-recalibration parameters, it is used in the constraint function as it involves the posterior model posterior. Because of this, we do allow users to pass additional arguments to optim to be used in this calculation. However, rather than use the ..., users should pass these arguments to optim_options via a list.

Optimizing over `\tau`

When tau=TRUE, the optimization routine operates relative to \tau = log(\delta) instead of \delta. Specification of start location x0 and bounds lb, ub should still be specified in terms of \delta. The brcal function will automatically convert from \delta to \tau. In the returned list, BR_params will always report in terms of \delta. However, the results returned in nloptr will reflect whichever scale nloptr() optimized on.

References

Birgin, E. G., and Martínez, J. M. (2008) Improving ultimate convergence of an augmented Lagrangian method, Optimization Methods and Software vol. 23, no. 2, p. 177-195.

Conn, A. R., Gould, N. I. M., and Toint, P. L. (1991) A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds, SIAM Journal of Numerical Analysis vol. 28, no. 2, p. 545-572.

Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions, The American Statistician 1-17.

Johnson, S. G., The NLopt nonlinear-optimization package, https://nlopt.readthedocs.io/en/latest/.

Kraft, D. (1988) A software package for sequential quadratic programming", Technical Report DFVLR-FB 88-28, Institut für Dynamik der Flugsysteme, Oberpfaffenhofen.

Kraft, D. (1994) Algorithm 733: TOMP-Fortran modules for optimal control calculations, ACM Transactions on Mathematical Software, vol. 20, no. 3, pp. 262-281.

Examples


# Simulate 50 predicted probabilities
x <- runif(50)
# Simulated 50 binary event outcomes using x
y <- rbinom(50, 1, x)  # By construction, x is well calibrated.

# Perform 90% boldness-recalibration by setting t=0.9
# To suppress all output from nloptr() for each iteration use print_level=0
# (For reduced output at each iteration used print_level=1 or 2)
# To specify different starting values, use x0 and set start_at_MLEs=FALSE
brcal(x, y, t=0.9, x0=c(1,1), start_at_MLEs=FALSE, print_level=0)

# Adjust stopping criteria set max number of evaluations to 50 (maxeval) OR 
# stop after 0.5 second (maxtime)
# and set optimization bounds using lb and ub
brcal(x, y, maxeval = 50, maxtime = 0.5, lb=c(0.001, 0), ub=c(10, 10), print_level=0)

# Specify different options for nloptr & optim
brcal(x, y, opts=list(xtol_abs=0.01,
                     local_opts=list(algorithm="NLOPT_LD_MMA")), 
                     optim_options=list(method = "L-BFGS-B", lower = c(0, -1), 
                              upper = c(10, 25), control=list(factr=0.01)),
                              print_level=0)
                              
# Push probabilities away from bounds by 0.000001 and
# Stop outer optimization when parameters change by less than .001
x3 <- c(runif(25, 0, 0.0001), runif(25, .9999, 1))
y3 <- rbinom(50, 1, 0.5)
brcal(x3, y3, epsilon=0.000001, xtol_rel_outer = .01, print_level=0)

# See vignette for more examples

[Package BRcal version 0.0.4 Index]