calibrate_trial {adaptr}  R Documentation 
This function calibrates a trial specification using a Gaussian process-based
Bayesian optimisation algorithm.
The function calibrates an input trial specification object (using repeated
calls to run_trials() while adjusting the trial specification) to a target
value within a search_range in a single input dimension (x), in order to find
an optimal value (y).
The default (and expectedly most common) use case is to calibrate a trial
specification by adjusting the superiority and inferiority thresholds to
obtain a certain probability of superiority; if used with a trial
specification with identical underlying outcomes (no between-arm
differences), this probability is an estimate of the Bayesian analogue of the
total type 1 error rate for the outcome driving the adaptations, and if
between-arm differences are present, it is an estimate of the Bayesian
analogue of the power.
The default is to perform the calibration while varying single, constant,
symmetric thresholds for superiority
/ inferiority
throughout a trial
design, as described in Details, and the default values have been chosen
to function well in this case.
Advanced users may use the function to calibrate trial specifications
according to other metrics; see Details for how to specify a custom
function used to modify (or recreate) a trial specification object during
the calibration process.
The underlying Gaussian process model and its control hyperparameters are
described under Details, and the model is partially based on code from
Gramacy 2020 (with permission; see References).
calibrate_trial(
trial_spec,
n_rep = 1000,
cores = NULL,
base_seed = NULL,
fun = NULL,
target = 0.05,
search_range = c(0.9, 1),
tol = target/10,
dir = 0,
init_n = 2,
iter_max = 25,
resolution = 5000,
kappa = 0.5,
pow = 1.95,
lengthscale = 1,
scale_x = TRUE,
noisy = is.null(base_seed),
narrow = !noisy & !is.null(base_seed),
prev_x = NULL,
prev_y = NULL,
path = NULL,
overwrite = FALSE,
version = NULL,
compress = TRUE,
sparse = TRUE,
progress = NULL,
export = NULL,
export_envir = parent.frame(),
verbose = FALSE,
plot = FALSE
)
trial_spec
    trial specification object to calibrate, as generated by setup_trial(),
    setup_trial_binom(), or setup_trial_norm().

n_rep
    single integer, the number of simulations to run at each evaluation
    (defaults to 1000).

cores
    passed to run_trials().

base_seed
    single integer or NULL (default); the base random seed, passed to
    run_trials().

fun
    function used to modify the trial specification object during
    calibration, or NULL (default) to use the default calibration strategy
    described in Details.

target
    single finite numeric value (defaults to 0.05); the target y value to
    calibrate the trial specification to.

search_range
    finite numeric vector of length 2 (defaults to c(0.9, 1)); the range of
    x values searched during the calibration process.

tol
    single finite numeric value (defaults to target / 10); the accepted
    tolerance around target.

dir
    single numeric value (defaults to 0); specifies the direction(s) of the
    tolerance range. If 0, the tolerance range extends to both sides of
    target; if negative or positive, the range extends only below or above
    target, respectively.

init_n
    single integer (defaults to 2); the number of initial evaluations spread
    evenly over search_range, including evaluations at both boundaries.

iter_max
    single integer (defaults to 25); the maximum number of new evaluations
    after the initial grid has been evaluated.

resolution
    single integer (defaults to 5000); the size of the prediction grid used
    by the Gaussian process model, as described in Details.

kappa
    single numeric value (defaults to 0.5); controls the width of the
    uncertainty bounds used by the acquisition function, as described in
    Details.

pow
    single numerical value in the [1, 2] range (defaults to 1.95); controls
    the smoothness of the Gaussian process model, as described in Details.

lengthscale
    single numerical value (defaults to 1) or numeric vector of length 2;
    scales the covariance matrix, as described in Details. If a range of
    values is provided, the value used is chosen by a secondary optimisation
    process.

scale_x
    single logical value; if TRUE (the default), x values are scaled to the
    [0, 1] range, as described in Details.

noisy
    single logical value; if TRUE, a noisy (stochastic) process is assumed,
    as described in Details. Defaults to is.null(base_seed), i.e., a
    noiseless process is assumed when a base_seed is provided.

narrow
    single logical value. If TRUE, the search range is narrowed with each
    iteration, as described in Details; defaults to TRUE when noisy is FALSE
    and a base_seed is provided, and to FALSE otherwise.

prev_x, prev_y
    numeric vectors of equal lengths, corresponding to previous evaluations.
    If provided, these will be used in the calibration process (added before
    the initial grid is set up, with values in the grid matching values in
    prev_x skipped).

path
    single character string or NULL (default); if provided, the path to a
    file where calibration results are saved and from which previous results
    are reloaded.

overwrite
    single logical, defaults to FALSE; whether an existing file at path may
    be overwritten.

version
    passed to saveRDS() when saving results to path.

compress
    passed to saveRDS() when saving results to path.

sparse, progress, export, export_envir
    passed to run_trials().

verbose
    single logical, defaults to FALSE; whether to print status messages
    during the calibration process.

plot
    single logical, defaults to FALSE; whether to generate plots of each
    Gaussian process suggestion step (returned in the plots element of the
    result if TRUE).
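The interplay between target, tol, and dir can be sketched as follows. This is
an illustration only, not adaptr's internal code, and the exact semantics of
negative/positive dir values are an assumption here:

```r
# Illustrative sketch (not adaptr's internal code) of the tolerance check
# implied by target, tol, and dir; the handling of dir is assumed
within_tolerance <- function(y, target, tol, dir = 0) {
  if (dir == 0) {
    abs(y - target) <= tol            # symmetric range around target
  } else if (dir < 0) {
    y <= target & y >= target - tol   # only values at or below target
  } else {
    y >= target & y <= target + tol   # only values at or above target
  }
}

within_tolerance(0.049, target = 0.05, tol = 0.005)
within_tolerance(0.056, target = 0.05, tol = 0.005, dir = -1)
```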
Default calibration
If fun is NULL (the default), the default calibration strategy will be
employed. Here, the target y is the probability of superiority (as described
in check_performance() and summary()), and the function will calibrate
constant stopping thresholds for superiority and inferiority (as described
in setup_trial(), setup_trial_binom(), and setup_trial_norm()). This
corresponds to the Bayesian analogue of the type 1 error rate if there are
no differences between arms in the trial specification (which we expect to
be the most common use case), or the power, if there are differences between
arms in the trial specification.
In the default case, the calibration process will use the input x as the
stopping threshold for superiority and 1 - x as the stopping threshold for
inferiority, i.e., the stopping thresholds will be constant and symmetric.
The underlying default function calibrated is typically essentially
noiseless if a high enough number of simulations are used with an
appropriate random base_seed, and is generally monotonically decreasing. The
default values for the control hyperparameters (including init_n, kappa,
pow, lengthscale, narrow, scale_x, etc.) have been set to normally work well
in this case. Thus, few initial grid evaluations are used, and if a
base_seed is provided, a noiseless process is assumed, the search range is
narrowed with each iteration, and the uncertainty bounds used in the
acquisition function (corresponding to quantiles from the posterior
predictive distribution) are relatively narrow.
Specifying calibration functions
A user-specified calibration function should have the following structure:

# The function must take the arguments x and trial_spec
# trial_spec is the original trial_spec object which should be modified
# (alternatively, it may be respecified, but the argument should still
# be included, even if ignored)
function(x, trial_spec) {
  # Calibrate trial_spec, here as in the default function
  trial_spec$superiority <- x
  trial_spec$inferiority <- 1 - x

  # If relevant, known y values corresponding to specific x values may be
  # returned without running simulations (here done as in the default
  # function). In that case, a code block like the one below can be
  # included, with changed x/y values - of note, the other return values
  # should not be changed
  if (x == 1) {
    return(list(sims = NULL, trial_spec = trial_spec, y = 0))
  }

  # Run simulations - this block should be included unchanged
  sims <- run_trials(trial_spec, n_rep = n_rep, cores = cores,
                     base_seed = base_seed, sparse = sparse,
                     progress = progress, export = export,
                     export_envir = export_envir)

  # Return results - only the y value here should be changed
  # summary() or check_performance() will often be used here
  list(sims = sims, trial_spec = trial_spec, y = summary(sims)$prob_superior)
}
Note: changes to the trial specification are not validated; users who
define their own calibration functions need to ensure that changes to
calibrated trial specifications do not lead to invalid values; otherwise,
the procedure is prone to error when simulations are run. In particular,
users should be aware that changing true_ys in a trial specification
generated using the simplified setup_trial_binom() and setup_trial_norm()
functions requires changes in multiple places in the object, including in
the functions used to generate random outcomes. In these cases (and
otherwise if in doubt), regenerating the trial_spec instead of modifying it
should be preferred, as this is safer and leads to proper validation.
Note: if the y values corresponding to certain x values are known, the user
may return these values directly without running simulations (e.g., in the
default case, an x of 1 would require >100% or <0% probabilities for the
stopping rules, which is impossible, and hence the y value in this case is
0 by definition).
Gaussian process optimisation function and control hyperparameters
The calibration function uses a relatively simple Gaussian process
optimisation function with settings that should work well for the default
calibration function but can be changed as required; this should be
considered if calibrating according to other targets (the effects of using
other settings may be evaluated in greater detail by setting verbose and
plot to TRUE).
The function may perform either interpolation (i.e., assuming a noiseless,
deterministic process with no uncertainty at the values already evaluated)
or regression (i.e., assuming a noisy, stochastic process), controlled by
the noisy argument.
The covariance matrix (or kernel) is defined as:

exp(-|x - x'|^pow / lengthscale)

with |x - x'| corresponding to a matrix containing the absolute Euclidean
distances between values of x (and values on the prediction grid), scaled to
the [0, 1] range if scale_x is TRUE and kept on their original scale if
FALSE. Scaling is generally recommended (as this leads to more comparable
and predictable effects of pow and lengthscale, regardless of the true
scale), and is also recommended if the range of values is smaller than this
range. The absolute distances are raised to the power pow, which must be a
value in the [1, 2] range. Together with lengthscale, pow controls the
smoothness of the Gaussian process model, with 1 corresponding to less
smoothing (i.e., piecewise straight lines between all evaluations if
lengthscale is 1) and values > 1 corresponding to more smoothing. After
raising the absolute distances to the chosen power pow, the resulting
matrix is divided by lengthscale. The default is 1 (no change); values < 1
lead to faster decay in correlations and thus less smoothing (more wiggly
fits), and values > 1 lead to more smoothing (less wiggly fits). If a single
specific value is supplied for lengthscale, this is used; if a range of
values is provided, a secondary optimisation process determines the value to
use within that range.
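As a concrete illustration of the kernel described above (a sketch, not
adaptr's internal code), the covariance matrix for a small set of x values
with scale_x behaviour can be computed as:

```r
# Illustrative sketch (not adaptr's internal code) of the covariance kernel
# described above, with x scaled to [0, 1] as when scale_x is TRUE
covariance_matrix <- function(x, pow = 1.95, lengthscale = 1) {
  x_scaled <- (x - min(x)) / (max(x) - min(x))  # scale to the [0, 1] range
  d <- abs(outer(x_scaled, x_scaled, "-"))      # absolute pairwise distances
  exp(-d^pow / lengthscale)                     # kernel: exp(-d^pow / lengthscale)
}

# Diagonal entries are 1 (zero distance); correlations decay with distance
round(covariance_matrix(c(0.90, 0.95, 1.00)), 3)
```

Lowering lengthscale or pow in this sketch makes the off-diagonal
correlations decay faster, matching the "less smoothing" behaviour described
above.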
Some minimal noise ("jitter") is always added to the diagonals of the
matrices where relevant to ensure numerical stability; if noisy is TRUE,
a "nugget" value will be determined using a secondary optimisation process.
Predictions will be made over an equally spaced grid of x values of size
resolution; if narrow is TRUE, this grid will only be spread out between the
x values with corresponding y values closest to and below, and closest to
and above, target, respectively, leading to a finer grid in the range of
relevance (as described above, this should only be used for processes that
are assumed to be noiseless and that can safely be assumed to be
monotonically increasing or decreasing within the search_range). To suggest
the next x value for evaluation, the function uses an acquisition function
based on bidirectional uncertainty bounds (posterior predictive
distributions) with widths controlled by the kappa hyperparameter. Higher
kappa values/wider uncertainty bounds lead to increased exploration (i.e.,
the algorithm is more prone to select values with high uncertainty,
relatively far from existing evaluations), while lower kappa values/narrower
uncertainty bounds lead to increased exploitation (i.e., the algorithm is
more prone to select values with less uncertainty, closer to the best
predicted mean values). The value in the x grid where one of the uncertainty
bounds has the smallest absolute distance to target is chosen (within the
narrowed range, if narrow is TRUE). See Greenhill et al., 2020 under
References for a general description of acquisition functions.
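The acquisition step described above can be sketched as follows. This is an
illustration under stated assumptions, not adaptr's internal code; mu and
sigma stand in for the posterior predictive means and standard deviations
over the prediction grid:

```r
# Illustrative sketch (not adaptr's internal code) of the acquisition step:
# suggest the grid value whose kappa-scaled uncertainty bound lies closest
# to the target
suggest_next_x <- function(x_grid, mu, sigma, target, kappa = 0.5) {
  lower <- mu - kappa * sigma                      # lower uncertainty bound
  upper <- mu + kappa * sigma                      # upper uncertainty bound
  dist <- pmin(abs(lower - target), abs(upper - target))
  x_grid[which.min(dist)]                          # smallest distance wins
}

# Toy example with a monotonically decreasing predicted mean
x_grid <- seq(0.9, 1, length.out = 101)
mu <- seq(0.2, 0, length.out = 101)  # predicted y decreasing in x
sigma <- rep(0.01, 101)              # constant predictive uncertainty
suggest_next_x(x_grid, mu, sigma, target = 0.05)
```

Increasing kappa in this sketch widens the bounds, so grid values farther
from the best predicted mean can come closest to the target, mirroring the
exploration/exploitation trade-off described above.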
IMPORTANT:
we recommend that control hyperparameters are explicitly specified, even
for the default calibration function. Although the default values should be
sensible for the default calibration function, these may change in the
future. Further, we generally recommend that users perform small-scale
comparisons (i.e., with fewer simulations than in the final calibration) of
the calibration process with different hyperparameters for specific use cases
beyond the default (possibly guided by setting the verbose
and plot
options to TRUE
) before running a substantial number of calibrations or
simulations, as the exact choices may have important influence on the speed
and likelihood of success of the calibration process.
It is the responsibility of the user to specify sensible values for the
settings and hyperparameters.
A list of special class "trial_calibration", which contains the following
elements that can be extracted using $ or [[:
success: single logical, TRUE if the calibration succeeded with the best
result being within the tolerance range, FALSE if the calibration process
ended after all allowed iterations without obtaining a result within the
tolerance range.

best_x: single numerical value, the x value (on the original, input scale)
at which the best y value was found, regardless of success.

best_y: single numerical value, the best y value obtained, regardless of
success.

best_trial_spec: the best calibrated version of the original trial_spec
object supplied, regardless of success (i.e., the returned trial
specification object is only adequately calibrated if success is TRUE).

best_sims: the trial simulation results (from run_trials()) leading to the
best y value, regardless of success. If no new simulations have been
conducted (e.g., if the best y value is from one of the prev_y values),
this will be NULL.

evaluations: a two-column data.frame containing the variables x and y,
corresponding to all x and y values (including values supplied through
prev_x/prev_y).

input_trial_spec: the unaltered, uncalibrated, original trial_spec object
provided to the function.

elapsed_time: the total run time of the calibration process.

control: list of the most central settings provided to the function.

fun: the function used for calibration; if NULL was supplied when starting
the calibration, the default function (described in Details) is returned
after being used in the function.

adaptr_version: the version of the adaptr package used to run the
calibration process.

plots: list containing ggplot2 plot objects of each Gaussian process
suggestion step; only included if plot is TRUE.
Gramacy RB (2020). Chapter 5: Gaussian Process Regression. In: Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Chapman & Hall/CRC, Boca Raton, Florida, USA. Available online.
Greenhill S, Rana S, Gupta S, Vellanki P, Venkatesh S (2020). Bayesian Optimization for Adaptive Experimental Design: A Review. IEEE Access, 8, 13937-13948. doi:10.1109/ACCESS.2020.2966228
## Not run:
# Setup a trial specification to calibrate
# This trial specification has similar event rates in all arms
# and as the default calibration settings are used, this corresponds to
# assessing the Bayesian type 1 error rate for this design and scenario
binom_trial <- setup_trial_binom(arms = c("A", "B"),
                                 true_ys = c(0.25, 0.25),
                                 data_looks = 1:5 * 200)
# Run calibration using default settings for most parameters
res <- calibrate_trial(binom_trial, n_rep = 1000, base_seed = 23)
# Print calibration summary result
res
## End(Not run)