calibrate_trial {adaptr} | R Documentation
Calibrate trial specification
Description
This function calibrates a trial specification using a Gaussian process-based
Bayesian optimisation algorithm. The function calibrates an input trial
specification object (using repeated calls to run_trials() while adjusting
the trial specification) to a target value within a search_range in a single
input dimension (x), in order to find an optimal value (y).
The default (and expectedly most common) use case is to calibrate a trial
specification by adjusting the superiority and inferiority thresholds to
obtain a certain probability of superiority. If used with a trial
specification with identical underlying outcomes (no between-arm
differences), this probability is an estimate of the Bayesian analogue of the
total type-1 error rate for the outcome driving the adaptations; if
between-arm differences are present, it corresponds to an estimate of the
Bayesian analogue of the power.
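As a hedged sketch of these two situations (arm labels, event rates, and
look points here are hypothetical, chosen to mirror the Examples below):

# Identical underlying outcomes in all arms: calibrating the probability of
# superiority estimates the Bayesian analogue of the total type-1 error rate
null_spec <- setup_trial_binom(arms = c("A", "B"),
                               true_ys = c(0.25, 0.25),
                               data_looks = 1:5 * 200)

# Between-arm difference present: the calibrated probability of superiority
# instead estimates the Bayesian analogue of the power
diff_spec <- setup_trial_binom(arms = c("A", "B"),
                               true_ys = c(0.25, 0.20),
                               data_looks = 1:5 * 200)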
The default is to perform the calibration while varying single, constant,
symmetric thresholds for superiority/inferiority throughout a trial design,
as described in Details, and the default values have been chosen to function
well in this case.
Advanced users may use the function to calibrate trial specifications
according to other metrics - see Details for how to specify a custom
function used to modify (or recreate) a trial specification object during
the calibration process.
The underlying Gaussian process model and its control hyperparameters are
described under Details, and the model is partially based on code from
Gramacy 2020 (with permission; see References).
Usage
calibrate_trial(
trial_spec,
n_rep = 1000,
cores = NULL,
base_seed = NULL,
fun = NULL,
target = 0.05,
search_range = c(0.9, 1),
tol = target/10,
dir = 0,
init_n = 2,
iter_max = 25,
resolution = 5000,
kappa = 0.5,
pow = 1.95,
lengthscale = 1,
scale_x = TRUE,
noisy = is.null(base_seed),
narrow = !noisy & !is.null(base_seed),
prev_x = NULL,
prev_y = NULL,
path = NULL,
overwrite = FALSE,
version = NULL,
compress = TRUE,
sparse = TRUE,
progress = NULL,
export = NULL,
export_envir = parent.frame(),
verbose = FALSE,
plot = FALSE
)
Arguments
trial_spec |
|
n_rep |
single integer, the number of simulations to run at each
evaluation. Values |
cores |
|
base_seed |
single integer or |
fun |
|
target |
single finite numeric value (defaults to |
search_range |
finite numeric vector of length |
tol |
single finite numeric value (defaults to |
dir |
single numeric value; specifies the direction(s) of the tolerance
range. If |
init_n |
single integer |
iter_max |
single integer |
resolution |
single integer (defaults to |
kappa |
single numeric value |
pow |
single numerical value in the |
lengthscale |
single numerical value (defaults to |
scale_x |
single logical value; if |
noisy |
single logical value; if |
narrow |
single logical value. If |
prev_x , prev_y |
numeric vectors of equal lengths, corresponding to
previous evaluations. If provided, these will be used in the calibration
process (added before the initial grid is setup, with values in the grid
matching values in |
path |
single character string or |
overwrite |
single logical, defaults to |
version |
passed to |
compress |
passed to |
sparse , progress , export , export_envir |
passed to |
verbose |
single logical, defaults to |
plot |
single logical, defaults to |
Details
Default calibration
If fun is NULL (the default), the default calibration strategy will be
employed. Here, the target y is the probability of superiority (as described
in check_performance() and summary()), and the function will calibrate
constant stopping thresholds for superiority and inferiority (as described
in setup_trial(), setup_trial_binom(), and setup_trial_norm()). This
corresponds to the Bayesian analogue of the type 1 error rate if there are
no differences between arms in the trial specification (which we expect to
be the most common use case), or the power, if there are differences between
arms in the trial specification.
The stopping calibration process will, in the default case, use the input x
as the stopping threshold for superiority and 1 - x as the stopping
threshold for inferiority, i.e., stopping thresholds will be constant and
symmetric.
The underlying default function calibrated is typically essentially
noiseless if a high enough number of simulations is used with an appropriate
random base_seed, and is generally monotonically decreasing. The default
values for the control hyperparameters (including init_n, kappa, pow,
lengthscale, narrow, scale_x, etc.) have been set to normally work well in
this case. Thus, few initial grid evaluations are used in this case, and if
a base_seed is provided, a noiseless process is assumed, the search range is
narrowed with each iteration, and the uncertainty bounds used in the
acquisition function (corresponding to quantiles from the posterior
predictive distribution) are relatively narrow.
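To illustrate how these defaults interact (a sketch; binom_trial is a
hypothetical trial specification as in the Examples below):

# With a base_seed: noisy defaults to FALSE (interpolation, assumed
# noiseless) and narrow defaults to TRUE (search range narrowed each
# iteration)
res_seeded <- calibrate_trial(binom_trial, n_rep = 1000, base_seed = 23)

# Without a base_seed: noisy defaults to TRUE (regression, assumed noisy)
# and narrow defaults to FALSE
res_noisy <- calibrate_trial(binom_trial, n_rep = 1000)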
Specifying calibration functions
A user-specified calibration function should have the following structure:
# The function must take the arguments x and trial_spec
# trial_spec is the original trial_spec object which should be modified
# (alternatively, it may be re-specified, but the argument should still
# be included, even if ignored)
function(x, trial_spec) {
  # Calibrate trial_spec, here as in the default function
  trial_spec$superiority <- x
  trial_spec$inferiority <- 1 - x

  # If relevant, known y values corresponding to specific x values may be
  # returned without running simulations (here done as in the default
  # function). In that case, a code block like the one below can be included,
  # with changed x/y values - of note, the other return values should not be
  # changed
  if (x == 1) {
    return(list(sims = NULL, trial_spec = trial_spec, y = 0))
  }

  # Run simulations - this block should be included unchanged
  sims <- run_trials(trial_spec, n_rep = n_rep, cores = cores,
                     base_seed = base_seed, sparse = sparse,
                     progress = progress, export = export,
                     export_envir = export_envir)

  # Return results - only the y value here should be changed
  # summary() or check_performance() will often be used here
  list(sims = sims, trial_spec = trial_spec, y = summary(sims)$prob_superior)
}
Note: changes to the trial specification are not validated; users who define
their own calibration function must ensure that changes to calibrated trial
specifications do not lead to invalid values; otherwise, the procedure is
prone to error when simulations are run. In particular, users should be
aware that changing true_ys in a trial specification generated using the
simplified setup_trial_binom() and setup_trial_norm() functions requires
changes in multiple places in the object, including in the functions used to
generate random outcomes. In these cases (and otherwise if in doubt),
re-generating the trial_spec instead of modifying it should be preferred, as
this is safer and leads to proper validation (see the sketch below).
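A minimal sketch of this recommendation (arm labels, event rates, and look
points are hypothetical): a calibration function that re-generates the
specification with full validation instead of modifying it in place.

function(x, trial_spec) {
  # Known y value at the boundary (see the note below); returned directly,
  # as simulations are unnecessary here
  if (x == 1) {
    return(list(sims = NULL, trial_spec = trial_spec, y = 0))
  }

  # Re-generate the specification with full validation, varying only the
  # stopping thresholds (as in the default calibration strategy)
  trial_spec <- setup_trial_binom(arms = c("A", "B"),
                                  true_ys = c(0.25, 0.25),
                                  data_looks = 1:5 * 200,
                                  superiority = x,
                                  inferiority = 1 - x)

  # Run simulations and return results, as in the template above
  sims <- run_trials(trial_spec, n_rep = n_rep, cores = cores,
                     base_seed = base_seed, sparse = sparse,
                     progress = progress, export = export,
                     export_envir = export_envir)
  list(sims = sims, trial_spec = trial_spec, y = summary(sims)$prob_superior)
}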
Note: if the y values corresponding to certain x values are known, then the
user may directly return these values without running simulations (e.g., in
the default case, an x of 1 would require >100% or <0% probabilities for the
stopping rules, which is impossible, and hence the y value in this case is
by definition 0).
Gaussian process optimisation function and control hyperparameters
The calibration function uses a relatively simple Gaussian process
optimisation function with settings that should work well for the default
calibration function; these can be changed as required and should be
reconsidered if calibrating according to other targets (the effects of using
other settings may be evaluated in greater detail by setting verbose and
plot to TRUE).
The function may perform either interpolation (i.e., assuming a noiseless,
deterministic process with no uncertainty at the values already evaluated)
or regression (i.e., assuming a noisy, stochastic process), controlled by
the noisy argument.
The covariance matrix (or kernel) is defined as:
exp(-||x - x'||^pow / lengthscale)
with ||x - x'|| corresponding to a matrix containing the absolute Euclidean
distances between values of x (and values on the prediction grid), scaled to
the [0, 1] range if scale_x is TRUE and kept on their original scale if
FALSE. Scaling is generally recommended (as this leads to more comparable
and predictable effects of pow and lengthscale, regardless of the true
scale), and is particularly recommended if the range of the x values is
substantially narrower than [0, 1]. The absolute distances are raised to the
power pow, which must be a value in the [1, 2] range. Together with
lengthscale, pow controls the smoothness of the Gaussian process model, with
1 corresponding to less smoothing (i.e., piecewise straight lines between
all evaluations if lengthscale is 1) and values > 1 corresponding to more
smoothing. After raising the absolute distances to the chosen power pow, the
resulting matrix is divided by lengthscale. The default is 1 (no change);
values < 1 lead to faster decay in correlations and thus less smoothing
(more wiggly fits), and values > 1 lead to more smoothing (less wiggly
fits). If a single specific value is supplied for lengthscale, this is used;
if a range of values is provided, a secondary optimisation process
determines the value to use within that range. A sketch of this covariance
calculation is given below.
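An illustrative sketch (not the package's internal code) of the covariance
described above:

# Covariance (kernel) matrix; x is a numeric vector of evaluated values
# (and/or prediction grid values)
make_cov <- function(x, pow = 1.95, lengthscale = 1, scale_x = TRUE) {
  if (scale_x) {
    x <- (x - min(x)) / (max(x) - min(x)) # scale to the [0, 1] range
  }
  d <- as.matrix(dist(x)) # absolute distances ||x - x'||
  exp(-d^pow / lengthscale)
}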
Some minimal noise ("jitter") is always added to the diagonals of the
matrices where relevant to ensure numerical stability; if noisy is TRUE, a
"nugget" value will be determined using a secondary optimisation process.
Predictions will be made over an equally spaced grid of x values of size
resolution; if narrow is TRUE, this grid will only be spread out between the
x values with corresponding y values closest to and below, and closest to
and above, target, respectively, leading to a finer grid in the range of
relevance (as described above, this should only be used for processes that
are assumed to be noiseless and that can safely be assumed to be
monotonically increasing or decreasing within the search_range). To suggest
the next x value for evaluation, the function uses an acquisition function
based on bi-directional uncertainty bounds (posterior predictive
distributions) with widths controlled by the kappa hyperparameter. Higher
kappa/wider uncertainty bounds lead to increased exploration (i.e., the
algorithm is more prone to select values with high uncertainty, relatively
far from existing evaluations), while lower kappa/narrower uncertainty
bounds lead to increased exploitation (i.e., the algorithm is more prone to
select values with less uncertainty, closer to the best predicted mean
values). The value in the x grid at which one of the uncertainty bounds has
the smallest absolute distance to target is chosen (within the narrowed
range, if narrow is TRUE); a sketch of this rule is given below. See
Greenhill et al., 2020 under References for a general description of
acquisition functions.
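The acquisition rule may be sketched as follows (an illustration only,
assuming uncertainty bounds of the simple form mu +/- kappa * sigma rather
than the package's exact quantile-based bounds):

# mu and sigma: posterior predictive means and standard deviations over the
# prediction grid x_grid; the grid value where either bound comes closest to
# the target is suggested as the next evaluation
suggest_next <- function(x_grid, mu, sigma, target, kappa = 0.5) {
  lower <- mu - kappa * sigma
  upper <- mu + kappa * sigma
  dists <- pmin(abs(lower - target), abs(upper - target))
  x_grid[which.min(dists)]
}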
IMPORTANT:
We recommend that control hyperparameters are explicitly specified, even for
the default calibration function. Although the default values should be
sensible for the default calibration function, they may change in the
future. Further, we generally recommend that users perform small-scale
comparisons (i.e., with fewer simulations than in the final calibration) of
the calibration process with different hyperparameters for specific use
cases beyond the default (possibly guided by setting the verbose and plot
options to TRUE) before running a substantial number of calibrations or
simulations, as the exact choices may have an important influence on the
speed and likelihood of success of the calibration process; an explicit call
is sketched below.
It is the responsibility of the user to specify sensible values for the
settings and hyperparameters.
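For example, a call pinning the current defaults explicitly (a sketch;
binom_trial as in the Examples below):

res <- calibrate_trial(
  binom_trial,
  n_rep = 1000, base_seed = 23,
  target = 0.05, tol = 0.05 / 10, dir = 0,
  search_range = c(0.9, 1),
  init_n = 2, iter_max = 25, resolution = 5000,
  kappa = 0.5, pow = 1.95, lengthscale = 1,
  scale_x = TRUE
)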
Value
A list of special class "trial_calibration", which contains the following
elements that can be extracted using $ or [[:
- success: single logical, TRUE if the calibration succeeded with the best result being within the tolerance range, FALSE if the calibration process ended after all allowed iterations without obtaining a result within the tolerance range.
- best_x: single numerical value, the x-value (on the original, input scale) at which the best y-value was found, regardless of success.
- best_y: single numerical value, the best y-value obtained, regardless of success.
- best_trial_spec: the best calibrated version of the original trial_spec object supplied, regardless of success (i.e., the returned trial specification object is only adequately calibrated if success is TRUE).
- best_sims: the trial simulation results (from run_trials()) leading to the best y-value, regardless of success. If no new simulations have been conducted (e.g., if the best y-value is from one of the prev_y values), this will be NULL.
- evaluations: a two-column data.frame containing the variables x and y, corresponding to all x-values and y-values (including values supplied through prev_x/prev_y).
- input_trial_spec: the unaltered, uncalibrated, original trial_spec object provided to the function.
- elapsed_time: the total run time of the calibration process.
- control: list of the most central settings provided to the function.
- fun: the function used for calibration; if NULL was supplied when starting the calibration, the default function (described in Details) is returned after being used in the function.
- adaptr_version: the version of the adaptr package used to run the calibration process.
- plots: list containing ggplot2 plot objects of each Gaussian process suggestion step; only included if plot is TRUE.
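For instance, central results may be extracted like this (assuming res is a
"trial_calibration" object returned by calibrate_trial()):

res$success            # TRUE if a result within the tolerance range was found
res$best_x             # calibrated stopping threshold in the default case
res$best_trial_spec    # calibrated trial specification object
head(res$evaluations)  # all evaluated x/y pairs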
References
Gramacy RB (2020). Chapter 5: Gaussian Process Regression. In: Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Chapman & Hall/CRC, Boca Raton, Florida, USA. Available online.
Greenhill S, Rana S, Gupta S, Vellanki P, Venkatesh S (2020). Bayesian Optimization for Adaptive Experimental Design: A Review. IEEE Access, 8, 13937-13948. doi:10.1109/ACCESS.2020.2966228
Examples
## Not run:
# Setup a trial specification to calibrate
# This trial specification has similar event rates in all arms
# and as the default calibration settings are used, this corresponds to
# assessing the Bayesian type 1 error rate for this design and scenario
binom_trial <- setup_trial_binom(arms = c("A", "B"),
                                 true_ys = c(0.25, 0.25),
                                 data_looks = 1:5 * 200)
# Run calibration using default settings for most parameters
res <- calibrate_trial(binom_trial, n_rep = 1000, base_seed = 23)
# Print calibration summary result
res
## End(Not run)