plot_params {BRcal}    R Documentation

Draw image plot of posterior model probability surface.

Description

Function to visualize the posterior model probability of the given set of probabilities, x, after LLO-adjustment across a grid of uniformly spaced \delta and \gamma values, with optional contours.

Usage

plot_params(
  x = NULL,
  y = NULL,
  z = NULL,
  t_levels = NULL,
  Pmc = 0.5,
  event = 1,
  k = 100,
  dlim = c(1e-04, 5),
  glim = c(1e-04, 5),
  zlim = c(0, 1),
  return_z = FALSE,
  epsilon = .Machine$double.eps,
  contours_only = FALSE,
  main = "Posterior Model Probability of Calibration",
  xlab = "delta",
  ylab = "gamma",
  optim_options = NULL,
  imgplt_options = list(legend.lab = ""),
  contour_options = list(drawlabels = TRUE, labcex = 0.6, lwd = 1, col =
    ifelse(contours_only, "black", "white"))
)

Arguments

x

a numeric vector of predicted probabilities of an event. Must only contain values in [0,1].

y

a vector of outcomes corresponding to probabilities in x. Must only contain two unique values (one for "events" and one for "non-events"). By default, this function expects a vector of 0s (non-events) and 1s (events).

z

Matrix returned by a previous call to plot_params() containing posterior model probabilities across a k \times k grid of \delta and \gamma. Assumes z was constructed using the same k, dlim, and glim as the current function call.

t_levels

Vector of desired level(s) of calibration at which to plot contours.

Pmc

The prior model probability for the calibrated model M_c.

event

Value in y that represents an "event". Default value is 1.

k

The number of uniformly spaced \delta and \gamma values used to construct the k \times k grid.

dlim

Vector with bounds for \delta, must be finite.

glim

Vector with bounds for \gamma, must be finite.

zlim

Vector with bounds for posterior probability of calibration, must be in [0,1].

return_z

Logical. If TRUE, the matrix of posterior model probabilities across the specified k \times k grid of \delta and \gamma will be returned.

epsilon

Amount by which probabilities are pushed away from 0 or 1 boundary for numerical stability. If a value in x < epsilon, it will be replaced with epsilon. If a value in x > 1-epsilon, that value will be replaced with 1-epsilon.

contours_only

Logical. If TRUE, only the contours at the specified t_levels will be plotted, with no color for the posterior model probability across the k \times k grid of \delta and \gamma.

main

Plot title.

xlab

Label for x-axis.

ylab

Label for y-axis.

optim_options

List of additional arguments to be passed to optim().

imgplt_options

List of additional arguments to be passed to image.plot().

contour_options

List of additional arguments to be passed to contour().

Details

This function leverages the image.plot function from the fields package and the contour function from the graphics package.

The goal of this function is to visualize how the posterior model probability changes under different recalibration parameters, as this is used in boldness-recalibration. To do so, a k by k grid of uniformly spaced candidate values for \delta and \gamma is constructed. Then x is LLO-adjusted under each pair of \delta and \gamma values. The posterior model probability of each LLO-adjusted set is calculated, and this quantity is used to color each grid cell in the image plot to visualize the change in calibration. See below for more details on setting the grid.
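The grid construction described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names llo_adjust and post_prob_grid are hypothetical, the LLO form and the BIC-based approximation to the posterior model probability are assumed from Guthrie and Franck (2024), and glm() is used here as a stand-in for the optimizer.

```r
# Hypothetical helper: LLO-adjust probabilities x with parameters delta and gamma.
llo_adjust <- function(x, delta, gamma) {
  delta * x^gamma / (delta * x^gamma + (1 - x)^gamma)
}

# Hypothetical helper: approximate posterior model probability of calibration
# for the LLO-adjusted set at every cell of a k x k grid.
post_prob_grid <- function(x, y, k = 10, dlim = c(1e-4, 5), glim = c(1e-4, 5),
                           Pmc = 0.5, epsilon = .Machine$double.eps) {
  clamp <- function(p) pmin(pmax(p, epsilon), 1 - epsilon)
  x <- clamp(x)
  n <- length(y)

  # Maximized log-likelihood under the uncalibrated model M_u. Because LLO is
  # linear on the log-odds scale, this maximum is the same for every
  # (delta, gamma) pair with gamma != 0, so it is computed once.
  fit <- glm(y ~ qlogis(x), family = binomial)
  bic_mu <- 2 * log(n) - 2 * as.numeric(logLik(fit))

  d_grid <- seq(dlim[1], dlim[2], length.out = k)
  g_grid <- seq(glim[1], glim[2], length.out = k)

  cell <- function(d, g) {
    p <- clamp(llo_adjust(x, d, g))
    # Under the calibrated model M_c the adjusted probabilities are taken as is.
    bic_mc <- -2 * sum(dbinom(y, 1, p, log = TRUE))
    bf <- exp(-(bic_mu - bic_mc) / 2)
    1 / (1 + bf * (1 - Pmc) / Pmc)
  }

  # Rows index delta, columns index gamma.
  z <- outer(d_grid, g_grid, Vectorize(cell))
  list(z = z, delta = d_grid, gamma = g_grid)
}

set.seed(49)
x <- runif(50)
y <- rbinom(50, 1, x)
grid <- post_prob_grid(x, y, k = 10)
dim(grid$z)
```

Each entry of grid$z plays the role of the value used to color one cell. The matrix layout here is an assumption and should not be passed to plot_params() as z.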

By default, only the posterior model probability surface is plotted. Argument t_levels can be used to optionally add contours at specified levels of the posterior model probability of calibration. These contours help users visualize the different values of t at which they may want to boldness-recalibrate. To draw only the contours without the colored posterior model probability surface, users can set contours_only=TRUE.

Value

If return_z = TRUE, a list with the following attributes is returned:

z

Matrix containing posterior model probabilities across the k \times k grid of uniformly spaced values of \delta and \gamma in the specified ranges dlim and glim, respectively.

dlim

Vector with bounds for \delta used to construct z.

glim

Vector with bounds for \gamma used to construct z.

k

The number of uniformly spaced \delta and \gamma values used to construct z.

Setting grid for \delta and \gamma

Arguments dlim and glim set the bounds of the \delta, \gamma grid, and argument k dictates its density. Some care is required when selecting these arguments. The goal is to determine what range of \delta and \gamma encompasses the region of non-zero posterior probabilities of calibration. However, it is not feasible to check the entire parameter space (as it is unbounded), and even within smaller regions it can be difficult to detect where non-zero posterior probabilities are produced without a very dense grid (large k), as the region is often quite small relative to the entire parameter space. This is problematic because computation time increases as k grows.

We suggest the following scheme for setting k, dlim, and glim. First, fix k at some small number, less than 20 for the sake of computation time. Then, center a grid with a small range around the MLEs for \delta and \gamma for the given x and y. Increase k until the grid detects approximately the probability of calibration that you expect at the MLEs. Then, expand your grid until the region with high probability of calibration is covered, or contract your grid to "zoom in" on the region. Finally, increase k to create a fine grid of values.
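The first steps of this scheme can be sketched in base R. This is only an illustration: it relies on the assumption that LLO-adjustment is linear on the log-odds scale, logit(p) = log(\delta) + \gamma logit(x), so a logistic regression of y on logit(x) recovers the MLEs. The use of glm() and the half_width value are choices made for this sketch, not part of the package.

```r
set.seed(49)
x <- runif(50)
y <- rbinom(50, 1, x)

# Logistic regression on the log-odds of x: intercept = log(delta), slope = gamma.
fit <- glm(y ~ qlogis(x), family = binomial)
delta_hat <- unname(exp(coef(fit)[1]))
gamma_hat <- unname(coef(fit)[2])

# Hypothetical half-width of the initial window around the MLEs.
half_width <- 0.5

# Keep the lower bounds strictly positive so that gamma = 0 is never on the grid.
dlim <- c(max(delta_hat - half_width, 1e-4), delta_hat + half_width)
glim <- c(max(gamma_hat - half_width, 1e-4), gamma_hat + half_width)

# Start with a coarse grid, then increase k once the window looks right.
k <- 10
```

These values could then be passed on as plot_params(x, y, k=k, dlim=dlim, glim=glim), widening or narrowing half_width before increasing k.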

Additionally, we caution users against including \gamma = 0 in the grid. This setting recalibrates all values in x to a single value, which is not desirable in practice. Unless the single value is near the base rate, the set will be poorly calibrated and minimally bold, which does not align with the goal of boldness-recalibration.
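A short base R check illustrates why \gamma = 0 collapses the predictions. The helper llo_adjust is a hypothetical stand-in written for this sketch, assuming the LLO form from Guthrie and Franck (2024).

```r
# Hypothetical helper: LLO-adjust probabilities x with parameters delta and gamma.
llo_adjust <- function(x, delta, gamma) {
  delta * x^gamma / (delta * x^gamma + (1 - x)^gamma)
}

x <- c(0.1, 0.4, 0.7, 0.9)

# With gamma = 0, x^0 and (1 - x)^0 are both 1, so every element
# becomes delta / (delta + 1), here 2/3, regardless of x.
adj <- llo_adjust(x, delta = 2, gamma = 0)
adj
```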

Reusing matrix z via return_z

The time bottleneck for this function occurs when calculating the posterior model probabilities across the grid of parameter values. Thus it can be useful to save the resulting matrix of values and reuse it when making minor cosmetic changes to your plot. If these adjustments do not change the grid bounds or density, users can set return_z=TRUE to return the underlying matrix of posterior model probabilities for plotting. Then, instead of specifying x and y, users can just pass the returned matrix as z. Note this assumes you are NOT making any changes to k, dlim, or glim. Also, it is not recommended that you construct your own matrix to pass via z, as this function relies on the structure returned by a previous call to plot_params().

Thinning

Another approach to speed up the calculations of this function is to thin the data used. However, this is generally not recommended unless the sample size is very large, as the calculated posterior model probability may change drastically under small sample sizes. This can lead to misleading results. Under large sample sizes where thinning is used, note that the plot is only an approximate visualization of the posterior model probability.
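One simple way to thin is to keep a random subsample of the paired observations. The sketch below assumes this interpretation of thinning; the sample sizes n and m are arbitrary illustrative choices.

```r
set.seed(49)
n <- 100000
x <- runif(n)
y <- rbinom(n, 1, x)

# Hypothetical thinning: keep m observations chosen at random without
# replacement, using the same indices for x and y so the pairs stay aligned.
m <- 5000
keep <- sample.int(n, size = m)
x_thin <- x[keep]
y_thin <- y[keep]
```

The thinned vectors could then be used in place of the full data, for example plot_params(x_thin, y_thin, k=20).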

Grid cells that show up white / round off warning message

In some cases, grid cells in the plot may show up as white instead of one of the colors from red to blue shown on the legend. A white grid cell indicates that there is no calculated posterior model probability at that cell. There are two common reasons for this: (1) that grid cell location is not covered by the z matrix used (i.e. you've adjusted the bounds without recalculating z) or (2) the values of the parameters at these locations cause the values in x to be LLO-adjusted such that they virtually equal 0 or 1. This invokes the use of epsilon to push them away from these boundaries for stability. However, in these extreme cases this can cause inaccuracies in this plot. For this reason, we either throw the warning message: "Roundoff may cause inaccuracies in upper region of plot" or allow the cell to be plotted as white to notify the user and avoid plotting artifacts.
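The second case can be reproduced in base R with deliberately extreme parameter values. The helper llo_adjust is hypothetical and assumes the LLO form from Guthrie and Franck (2024). The clamping step is a sketch of how epsilon is described above, not the package's code.

```r
# Hypothetical helper: LLO-adjust probabilities x with parameters delta and gamma.
llo_adjust <- function(x, delta, gamma) {
  delta * x^gamma / (delta * x^gamma + (1 - x)^gamma)
}

eps <- .Machine$double.eps
x <- c(0.2, 0.8)

# A very large gamma drives the adjusted values to the boundaries.
adj <- llo_adjust(x, delta = 5, gamma = 50)
adj[2] == 1   # TRUE: indistinguishable from 1 in double precision
adj[1] < eps  # TRUE: virtually 0

# Push the values away from the boundaries by epsilon.
clamped <- pmin(pmax(adj, eps), 1 - eps)
```

After clamping, the values are strictly inside (0, 1), but they no longer reflect the exact adjusted probabilities, which is the source of the inaccuracy described above.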

References

Guthrie, A. P., and Franck, C. T. (2024) Boldness-Recalibration for Binary Event Predictions, The American Statistician 1-17.

Nychka, D., Furrer, R., Paige, J., Sain, S. (2021). fields: Tools for spatial data. R package version 15.2, https://github.com/dnychka/fieldsRPackage.

Examples


# Simulate 50 predicted probabilities
set.seed(49)
x <- runif(50)
# Simulate 50 binary event outcomes using x
y <- rbinom(50, 1, x)  # By construction, x is well calibrated.

# Set grid density k=20
plot_params(x, y, k=20)

# Adjust bounds on delta and gamma
plot_params(x, y, k=20, dlim=c(0.001, 3), glim=c(0.01,2))

# Increase grid density via k & save z matrix for faster plotting
zmat_list <- plot_params(x, y, k=100, dlim=c(0.001, 3), glim=c(0.01,2), return_z=TRUE)

# Reuse z matrix
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2))

# Add contours at t=0.95, 0.9, and 0.8
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2), t_levels=c(0.95, 0.9, 0.8))

# Add points for 95% boldness-recalibration parameters
br95 <- brcal(x, y, t=0.95, print_level=0)
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2), t_levels=c(0.95, 0.9, 0.8))
points(br95$BR_params[1], br95$BR_params[2], pch=19, col="white")

# Change color and size of contours
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2), t_levels=c(0.99, 0.1),
            contour_options=list(col="orchid", lwd=2))

# Plot contours only
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2), t_levels=c(0.95, 0.9, 0.8),
            contours_only=TRUE)

# Pass arguments to image.plot()
plot_params(z=zmat_list$z, k=100, dlim=c(0.001, 3), glim=c(0.01,2),
            imgplt_options=list(horizontal = TRUE, nlevel=10, 
            legend.lab="Posterior Model Prob"))

# See vignette for more examples


[Package BRcal version 0.0.4 Index]