check_performance {adaptr}    R Documentation

Check performance metrics for trial simulations


Description

Calculates performance metrics for a trial specification based on simulation results from the run_trials() function, with bootstrapped uncertainty measures if requested. Uses extract_results(), which may be used directly to extract key trial results without summarising. This function is also used by summary() to calculate the performance metrics presented by that function.


Usage

check_performance(
  object,
  select_strategy = "control if available",
  select_last_arm = FALSE,
  select_preferences = NULL,
  te_comp = NULL,
  raw_ests = FALSE,
  final_ests = NULL,
  restrict = NULL,
  uncertainty = FALSE,
  n_boot = 5000,
  ci_width = 0.95,
  boot_seed = NULL,
  cores = NULL
)



Arguments

object
trial_results object, output from the run_trials() function.


select_strategy
single character string. If a trial was not stopped due to superiority (or had only 1 arm remaining, if select_last_arm is set to TRUE in trial designs with a common control arm; see below), this parameter specifies which arm will be considered selected when calculating trial design performance metrics, as described below; this corresponds to the consequence of an inconclusive trial, i.e., which arm would then be used in practice.
The following options are available and must be written exactly as below (case sensitive, cannot be abbreviated):

  • "control if available" (default): selects the first control arm for trials with a common control arm if this arm is active at end-of-trial, otherwise no arm will be selected. For trial designs without a common control, no arm will be selected.

  • "none": selects no arm in trials not ending with superiority.

  • "control": similar to "control if available", but will throw an error if used for trial designs without a common control arm.

  • "final control": selects the final control arm regardless of whether the trial was stopped for practical equivalence, futility, or at the maximum sample size; this strategy can only be specified for trial designs with a common control arm.

  • "control or best": selects the first control arm if still active at end-of-trial, otherwise selects the best remaining arm (defined as the remaining arm with the highest probability of being the best in the last adaptive analysis conducted). Only works for trial designs with a common control arm.

  • "best": selects the best remaining arm (as described under "control or best").

  • "list or best": selects the first remaining arm from a specified list (specified using select_preferences, technically a character vector). If none of these arms are active at end-of-trial, the best remaining arm will be selected (as described above).

  • "list": as specified above, but if no arms on the provided list remain active at end-of-trial, no arm is selected.
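
The options listed above can be read as a small decision rule applied to each inconclusive simulation. The following is a minimal sketch of that rule, not adaptr's internal code: the function select_arm() and all of its arguments (active, probs_best, control, final_control, preferences) are hypothetical names introduced for illustration only.

```r
# Illustrative only; not adaptr's implementation. All names are hypothetical.
# active:        character vector of arms still active at end-of-trial
# probs_best:    named numeric vector, probability of each arm being the best
# control:       first control arm, or NULL for designs without a common control
# final_control: control arm at end-of-trial (may differ from the first control)
# preferences:   character vector, used by the "list" strategies
select_arm <- function(strategy, active, probs_best, control = NULL,
                       final_control = NULL, preferences = NULL) {
  # Best remaining arm: highest probability of being best among active arms
  best <- function() names(which.max(probs_best[active]))
  # First control arm if the design has one and it is still active, else NA
  ctrl <- function() {
    if (!is.null(control) && control %in% active) control else NA_character_
  }
  # First listed arm that is still active, else NA
  listed <- function() {
    hit <- preferences[preferences %in% active]
    if (length(hit) > 0) hit[1] else NA_character_
  }
  # Strategies restricted to designs with a common control arm
  need_control <- function() {
    if (is.null(control)) stop("strategy requires a common control arm")
  }
  switch(strategy,
    "control if available" = ctrl(),
    "none" = NA_character_,
    "control" = { need_control(); ctrl() },
    "final control" = { need_control(); final_control },
    "control or best" = { need_control(); if (is.na(ctrl())) best() else ctrl() },
    "best" = best(),
    "list or best" = if (is.na(listed())) best() else listed(),
    "list" = listed(),
    stop("unknown strategy: ", strategy)
  )
}

# Hypothetical inconclusive trial: control "A" was dropped, "B" and "C" remain
probs <- c(A = 0.05, B = 0.60, C = 0.35)
res_cob  <- select_arm("control or best", active = c("B", "C"),
                       probs_best = probs, control = "A")
res_list <- select_arm("list", active = c("B", "C"),
                       probs_best = probs, preferences = c("A", "C"))
res_cia  <- select_arm("control if available", active = c("B", "C"),
                       probs_best = probs, control = "A")
```

An NA result stands for "no arm selected" in this sketch.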


select_last_arm
single logical, defaults to FALSE. If TRUE, the only remaining active arm (the last control) will be selected in trials with a common control arm ending with equivalence or futility, before considering the options specified in select_strategy. Must be FALSE for trial designs without a common control arm.


select_preferences
character vector specifying the arms used for selection if either the "list or best" or the "list" option is specified for select_strategy. Can only contain valid arms available in the trial.


te_comp
character string, treatment-effect comparator. Can be either NULL (the default), in which case the first control arm is used for trial designs with a common control arm, or a string naming a single trial arm. Will be used when calculating sq_err_te (the squared error of the treatment effect comparing the selected arm to the comparator arm, as described below).


raw_ests
single logical. If FALSE (default), the posterior estimates (post_ests or post_ests_all, see setup_trial() and run_trial()) will be used to calculate sq_err (the squared error of the estimated effect compared to the specified effect in the selected arm) and sq_err_te (the squared error of the treatment effect comparing the selected arm to the comparator arm, as described for te_comp and below). If TRUE, the raw estimates (raw_ests or raw_ests_all, see setup_trial() and run_trial()) will be used instead of the posterior estimates.
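
As a rough sketch of the two squared-error quantities named above, the following computes them for a single hypothetical simulation. It assumes that the treatment effect is the simple difference between the selected arm and the comparator arm; the values and the names true_ys, est_ys, selected, and comparator are illustrative and not taken from the package.

```r
# Hypothetical single simulation; values and names are illustrative only
true_ys <- c(A = 0.20, B = 0.18)   # specified (true) outcomes per arm
est_ys  <- c(A = 0.21, B = 0.16)   # estimates (posterior or raw) per arm
selected   <- "B"                  # arm considered selected
comparator <- "A"                  # comparator arm (cf. te_comp)

# Squared error of the estimate in the selected arm
sq_err <- (est_ys[[selected]] - true_ys[[selected]])^2

# Squared error of the treatment effect, assuming it is a simple difference
est_te  <- est_ys[[selected]] - est_ys[[comparator]]
true_te <- true_ys[[selected]] - true_ys[[comparator]]
sq_err_te <- (est_te - true_te)^2
```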


final_ests
single logical. If TRUE (recommended), the final estimates calculated using outcome data from all patients randomised when trials are stopped are used (post_ests_all or raw_ests_all, see setup_trial() and run_trial()); if FALSE, the estimates calculated for each arm when an arm is stopped (or at the last adaptive analysis if not before) are used, based on data from patients who have reached follow-up at that time point rather than all patients randomised (post_ests or raw_ests, see setup_trial() and run_trial()). If NULL (the default), this argument will be set to FALSE if outcome data are available immediately after randomisation for all patients (for backwards compatibility, as final posterior estimates may vary slightly in this situation, even if using the same data); otherwise it will be set to TRUE. See setup_trial() for more details on how these estimates are calculated.


restrict
single character string or NULL. If NULL (default), results are summarised for all simulations; if "superior", results are summarised for simulations ending with superiority only; if "selected", results are summarised for simulations ending with a selected arm only (according to the specified arm selection strategy for simulations not ending with superiority). Some summary measures (e.g., prob_conclusive) have substantially different interpretations if restricted, but are calculated nonetheless.


uncertainty
single logical; if FALSE (default), uncertainty measures are not calculated; if TRUE, non-parametric bootstrapping is used to calculate uncertainty measures.


n_boot
single integer (default 5000); the number of bootstrap samples to use if uncertainty = TRUE. Values < 100 are not allowed and values < 1000 will lead to a warning, as results are likely to be unstable in those cases.


ci_width
single numeric >= 0 and < 1, the width of the percentile-based bootstrapped confidence intervals. Defaults to 0.95, corresponding to 95% confidence intervals.
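
To illustrate what non-parametric bootstrapping with percentile-based intervals amounts to, here is a minimal base-R sketch for one metric (the mean sample size). It is not the package's implementation: the sizes vector is made up, the seed is arbitrary, and the element names of the result only mirror the uncertainty columns for readability.

```r
set.seed(1)  # arbitrary seed so that the sketch is reproducible

# Hypothetical total sample sizes from 10 simulated trials
sizes <- c(400, 700, 900, 1200, 1500, 2000, 2000, 2000, 600, 1100)
n_boot <- 1000
ci_width <- 0.95

# Resample simulations with replacement and recompute the metric each time
boot_means <- replicate(n_boot, mean(sample(sizes, replace = TRUE)))

# Percentile-based interval: cut (1 - ci_width) / 2 from each tail
alpha <- 1 - ci_width
boot_summary <- c(
  est     = mean(sizes),
  err_sd  = sd(boot_means),    # bootstrapped SD
  err_mad = mad(boot_means),   # MAD scaled to be comparable to an SD
  lo_ci   = unname(quantile(boot_means, alpha / 2)),
  hi_ci   = unname(quantile(boot_means, 1 - alpha / 2))
)
boot_summary
```

With so few simulations the interval is wide and unstable, which is why larger numbers of simulations and bootstrap samples are preferable in practice.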


boot_seed
single integer, NULL (default), or "base". If a value is provided, this value will be used to initiate random seeds when bootstrapping, with the global random seed restored after the function has run. If "base" is specified, the base_seed specified in run_trials() is used. Regardless of whether simulations are run sequentially or in parallel, bootstrapped results will be identical if a boot_seed is specified.


cores
NULL or single integer. If NULL, a default value set by setup_cluster() will be used to control whether extractions of simulation results are done in parallel on a default cluster or sequentially in the main process; if a value has not been specified by setup_cluster(), cores will then be set to the value stored in the global "mc.cores" option (if previously set by options(mc.cores = <number of cores>)), and 1 if that option has not been specified.
If cores = 1, computations will be run sequentially in the primary process, and if cores > 1, a new parallel cluster will be set up using the parallel package and removed once the function completes. See setup_cluster() for details.


Details

The ideal design percentage (IDP) returned is based on Viele et al, 2020 doi:10.1177/1740774519877836 (and also described in Granholm et al, 2022 doi:10.1016/j.jclinepi.2022.11.002, which also describes the other performance measures) and has been adapted to work for trials with both desirable/undesirable outcomes and non-binary outcomes. Briefly, the expected outcome is calculated as the sum of the true outcomes in each arm multiplied by the corresponding selection probabilities (ignoring simulations with no selected arm). The IDP is then calculated as:
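
A hedged sketch of this calculation follows. It assumes the scaling described by Viele et al: the expected outcome is expressed as a percentage of the distance from the lowest to the highest true outcome, and the percentage is reversed when lower outcomes are better. The function idp(), its arguments, and the selection probabilities below are illustrative assumptions, not part of this function's interface.

```r
# Hypothetical inputs: true outcomes per arm and the probability that each arm
# is selected (values are made up for illustration)
true_ys   <- c(A = 0.20, B = 0.18, C = 0.22, D = 0.24)
sel_probs <- c(A = 0.10, B = 0.70, C = 0.15, D = 0.05)

idp <- function(true_ys, sel_probs, highest_is_best = FALSE) {
  # Renormalise so that simulations without a selected arm are ignored
  sel_probs <- sel_probs / sum(sel_probs)
  # Expected outcome: true outcomes weighted by selection probabilities
  expected <- sum(true_ys * sel_probs)
  # Percentage of the distance from the lowest to the highest true outcome
  scaled <- 100 * (expected - min(true_ys)) / (max(true_ys) - min(true_ys))
  # Reverse the scale when lower outcomes are better (undesirable outcomes)
  if (highest_is_best) scaled else 100 - scaled
}

idp_undesirable <- idp(true_ys, sel_probs, highest_is_best = FALSE)
idp_undesirable
```

Under these assumptions, a design that always selects the arm with the best true outcome scores 100 and one that always selects the worst arm scores 0.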


Value

A tidy data.frame with added class trial_performance (to control the number of digits printed, see print()), with the columns "metric" (described below), "est" (estimate of each metric), and the following four columns if uncertainty = TRUE: "err_sd" (bootstrapped SDs), "err_mad" (bootstrapped MAD-SDs, as described in setup_trial() and stats::mad()), "lo_ci", and "hi_ci", the latter two corresponding to the lower/upper limits of the percentile-based bootstrapped confidence intervals. Bootstrap estimates are not calculated for the minimum (_p0) and maximum (_p100) values of size, sum_ys, and ratio_ys, as non-parametric bootstrapping of minimum/maximum values is not sensible; bootstrap estimates for these values will be NA.
The following performance metrics are calculated:

See Also

extract_results(), summary(), plot_convergence(), plot_metrics_ecdf(), check_remaining_arms().


Examples

# Set up a trial specification
binom_trial <- setup_trial_binom(arms = c("A", "B", "C", "D"),
                                 control = "A",
                                 true_ys = c(0.20, 0.18, 0.22, 0.24),
                                 data_looks = 1:20 * 100)

# Run 10 simulations with a specified random base seed
res <- run_trials(binom_trial, n_rep = 10, base_seed = 12345)

# Check performance measures, without assuming that any arm is selected in
# the inconclusive simulations, with bootstrapped uncertainty measures
# (unstable in this example due to the very low number of simulations
# summarised):
check_performance(res, select_strategy = "none", uncertainty = TRUE,
                  n_boot = 1000, boot_seed = "base")

[Package adaptr version 1.3.2 Index]