brcal {BRcal}R Documentation

Boldness-Recalibration for Binary Events

Description

Perform Bayesian boldness-recalibration as specified in Guthrie and Franck (2024). Boldness-recalibration maximizes the spread in predictions (x) subject to a constraint on the minimum tolerable posterior probability of calibration (t).

Usage

brcal(
  x,
  y,
  t = 0.95,
  Pmc = 0.5,
  tau = FALSE,
  event = 1,
  start_at_MLEs = TRUE,
  x0 = NULL,
  lb = c(1e-05, -Inf),
  ub = c(Inf, Inf),
  maxeval = 500,
  maxtime = NULL,
  xtol_rel_inner = 1e-06,
  xtol_rel_outer = 1e-06,
  print_level = 3,
  epsilon = .Machine$double.eps,
  opts = NULL,
  optim_options = NULL
)

Arguments

x

a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].

y

a vector of outcomes corresponding to probabilities in x. Must only contain two unique values (one for "events" and one for "non-events"). By default, this function expects a vector of 0s (non-events) and 1s (events).

t

Minimum tolerable level of calibration in [0,1].

Pmc

The prior model probability for the calibrated model M_c.

tau

Logical. If TRUE, the optimization operates on \tau = log(\delta) instead of \delta. See details.

event

Value in y that represents an "event". Default value is 1.

start_at_MLEs

Logical. If TRUE, the optimizer will start at x_0 = the maximum likelihood estimates for \delta and \gamma. Otherwise, the user must specify x_0.

x0

Vector with starting locations for \delta and \gamma. This argument is ignored when start_at_MLEs = TRUE.

lb

Vector with lower bounds for \delta and \gamma. Use -Inf to indicate no lower bound.

ub

Vector with upper bounds for \delta and \gamma. Use Inf to indicate no upper bound.

maxeval

Value passed to nloptr() to stop optimization when the number of function evaluations exceeds maxeval.

maxtime

Value passed to nloptr() to stop optimization when evaluation time (in seconds) exceeds maxtime.

xtol_rel_inner

Value passed to nloptr() to stop the inner optimization routine when the parameter estimates for \delta and \gamma change by less than xtol_rel_inner.

xtol_rel_outer

Value passed to nloptr() to stop the outer optimization routine when the parameter estimates for \delta and \gamma change by less than xtol_rel_inner.

print_level

Value passed to nloptr() to control how much output is printed during optimization. Default is to print the most information allowable by nloptr(). Specify 0 to suppress all output.

epsilon

Amount by which probabilities are pushed away from 0 or 1 boundary for numerical stability. If a value in x < epsilon, it will be replaced with epsilon. If a value in x > 1-epsilon, that value will be replaced with 1-epsilon.

opts

List with options to be passed to nloptr.

optim_options

List with options to be passed to optim.

Details

The objective function in boldness-recalibration is

f(\delta, \gamma) = -sd(\mathbf{x}')

and the constraint is

g(\delta, \gamma) = -(P(M_c|\mathbf{y}, \mathbf{x}')-t) \leq 0.

As both the objective and constraint functions are non-linear with respect to \delta and \gamma, we use nloptr for this optimization rather than optim. Note that we use x to denote a vector of predicted probabilities, nloptr() uses x to denote the parameters being optimized. Thus, starting values for \delta and \gamma are passed via argument x0 and all output refers to the objective and constraint as f(x) and g(x).

By default, this function uses the Augmented Lagrangian Algorithm (AUGLAG) (Conn et. al. 1991, Birgin and Martinez 2008) as the outer optimization routine and Sequential Least-Squares Quadratic Programming (SLSQP) (Dieter 1988, Dieter 1994) as the inner optimization routine.

Value

A list with the following attributes:

nloptr

The list returned by nloptr() including convergence information, number of iterations, and more.

Pmc

The prior model probability for the calibrated model M_c specified in function call.

t

Desired level of calibration in [0,1] specified in function call.

BR_params

(100*t)% Boldness-recalibration estimates for \delta and \gamma.

sb

The Bayesian Information Criteria (BIC) for the calibrated model M_c.

probs

Vector of (100*t)% boldness-recalibrated probabilities.

Adjusting call to nloptr()

For more control over the optimization routine conducted by nloptr(), the user may specify their own options via the opts argument. Note that any objective, constraint, or gradient functions specified by the user will be overwritten by those specified in this package. See the documentation for nloptr() and the NLopt website for full details (https://nlopt.readthedocs.io/en/latest/).

Adjusting call to optim()

While optim() is not used for the non-linear constrained optimization for finding he boldness-recalibration parameters, it is used in the constraint function as it involves the posterior model posterior. Because of this, we do allow users to pass additional arguments to optim to be used in this calculation. However, rather than use the ..., users should pass these arguments to optim_options via a list.

Optimizing over \tau

When tau=TRUE, the optimization routine operates relative to \tau = log(\delta) instead of \delta. Specification of start location x0 and bounds lb, ub should still be specified in terms of \delta. The brcal function will automatically convert from \delta to \tau. In the returned list, BR_params will always report in terms of \delta. However, the results returned in nloptr will reflect whichever scale nloptr() optimized on.

References

Birgin, E. G., and Martínez, J. M. (2008) Improving ultimate convergence of an augmented Lagrangian method, Optimization Methods and Software vol. 23, no. 2, p. 177-195.

Conn, A. R., Gould, N. I. M., and Toint, P. L. (1991) A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds, SIAM Journal of Numerical Analysis vol. 28, no. 2, p. 545-572.

Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions, The American Statistician 1-17.

Johnson, S. G., The NLopt nonlinear-optimization package, https://nlopt.readthedocs.io/en/latest/.

Kraft, D. (1988) A software package for sequential quadratic programming", Technical Report DFVLR-FB 88-28, Institut für Dynamik der Flugsysteme, Oberpfaffenhofen.

Kraft, D. (1994) Algorithm 733: TOMP-Fortran modules for optimal control calculations, ACM Transactions on Mathematical Software, vol. 20, no. 3, pp. 262-281.

Examples


# Simulate 50 predicted probabilities
x <- runif(50)
# Simulated 50 binary event outcomes using x
y <- rbinom(50, 1, x)  # By construction, x is well calibrated.

# Perform 90% boldness-recalibration by setting t=0.9
# To suppress all output from nloptr() for each iteration use print_level=0
# (For reduced output at each iteration used print_level=1 or 2)
# To specify different starting values, use x0 and set start_at_MLEs=FALSE
brcal(x, y, t=0.9, x0=c(1,1), start_at_MLEs=FALSE, print_level=0)

# Adjust stopping criteria set max number of evaluations to 50 (maxeval) OR 
# stop after 0.5 second (maxtime)
# and set optimization bounds using lb and ub
brcal(x, y, maxeval = 50, maxtime = 0.5, lb=c(0.001, 0), ub=c(10, 10), print_level=0)

# Specify different options for nloptr & optim
brcal(x, y, opts=list(xtol_abs=0.01,
                     local_opts=list(algorithm="NLOPT_LD_MMA")), 
                     optim_options=list(method = "L-BFGS-B", lower = c(0, -1), 
                              upper = c(10, 25), control=list(factr=0.01)),
                              print_level=0)
                              
# Push probabilities away from bounds by 0.000001 and
# Stop outer optimization when parameters change by less than .001
x3 <- c(runif(25, 0, 0.0001), runif(25, .9999, 1))
y3 <- rbinom(50, 1, 0.5)
brcal(x3, y3, epsilon=0.000001, xtol_rel_outer = .01, print_level=0)

# See vignette for more examples


[Package BRcal version 0.0.4 Index]