R: Power calculation

pow {POSSA}

R Documentation

Power calculation

Description

Calculates power and local alphas based on simulated p values (which should be provided as created by the POSSA::sim function). For sequential testing, the calculation uses a staircase procedure in which an initially provided set of local alphas is repeatedly adjusted until the specified global type 1 error rate (e.g., global alpha = .05) is (approximately) reached. The adjustment value is decreased while the global type 1 error rate is larger than specified, and increased while it is smaller than specified; a smaller step is chosen whenever the direction (increase vs. decrease) changes. The procedure stops when the global type 1 error rate is close enough to the specified one (e.g., matches it up to 4 fractional digits) or when the smallest specified step has been passed. The adjustment works via a dedicated ("adjust") function that either replaces missing (NA) values with the varying adjustment value or, when there are no missing values, modifies the initial values by it in some manner (e.g., by addition or multiplication).
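The staircase idea can be illustrated with a small base R toy. This is only a sketch under simplifying assumptions, not POSSA's internal code: the helper names (`global_t1er`, `staircase`) are hypothetical, the two looks use independent uniform p values (real sequential looks share data and are correlated), and the same local alpha is used at both looks.

```r
set.seed(8)
n_iter <- 5000
# Simulated H0 p values at two looks (independent here, for simplicity only)
p_h0 <- cbind(look1 = runif(n_iter), look2 = runif(n_iter))

# Global type 1 error rate: share of iterations significant at any look
global_t1er <- function(alphas) {
    mean(p_h0[, 1] < alphas[1] | p_h0[, 2] < alphas[2])
}

# Hypothetical staircase: `adj` replaces both local alphas
staircase <- function(alpha_global = 0.05, adj = 0.025,
                      steps = 0.01 * 0.5^(0:11), precision = 4) {
    i <- 1          # index of the current step size
    direction <- 0  # -1 after a decrease, +1 after an increase
    repeat {
        t1er <- global_t1er(c(adj, adj))
        # stop when the rate matches the target at the given precision
        if (round(t1er, precision) == round(alpha_global, precision)) break
        new_dir <- if (t1er > alpha_global) -1 else 1
        # a change of direction moves on to the next (smaller) step size
        if (direction != 0 && new_dir != direction) {
            i <- i + 1
            if (i > length(steps)) break  # no steps left: finish
        }
        direction <- new_dir
        adj <- adj + new_dir * steps[i]
    }
    list(adj = adj, t1er = global_t1er(c(adj, adj)))
}

res <- staircase()
res
```

The starting value `0.025` is the global alpha divided by the number of looks, in line with the Bonferroni-style default described for `adj_init`.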

Usage

pow(
  p_values,
  alpha_locals = NULL,
  alpha_global = 0.05,
  adjust = TRUE,
  adj_init = NULL,
  staircase_steps = NULL,
  alpha_precision = 5,
  fut_locals = NULL,
  multi_logic_a = "all",
  multi_logic_fut = "all",
  multi_logic_global = "any",
  group_by = NULL,
  alpha_loc_nonstop = NULL,
  round_to = 5,
  iter_limit = 100,
  seed = 8,
  prog_bar = FALSE,
  hush = FALSE
)

Arguments

`p_values`	A `data.frame` containing the simulated iterations, looks, and corresponding H0 and H1 p value outcomes, as returned by the `POSSA::sim` function. (Custom data frames are also accepted, but may not work as expected.)
`alpha_locals`	A number, a numeric vector, or a named `list` of numeric vectors, that specify the initial set of local alphas that decide on statistical significance (for interim looks as well as for the final look), and, if significant, stop the experiment at the given interim look; to be adjusted via the `adjust` function; see the `adjust` parameter below. Any of the numbers included can always be `NA` values as well (which indicates alphas to be calculated; again, see the related `adjust` parameter below). In case of a vector or a list of vectors, the length of each vector must correspond exactly to the maximum number of looks in the `p_values` data frame. When a `list` is given, the names of the list element(s) must correspond to the root of the related H0 and H1 p value column name pair(s) (in the `p_values` data frame), that is, without the "`_h0`" and "`_h1`" suffixes: for example, if the column name pair is "`p_test4_h0`" and "`p_test4_h1`", the name of the corresponding list element should be "`p_test4`". If a single number or a single numeric vector is given, all potential p value column pairs are automatically detected as starting with the "`p_`" prefix and ending with "`_h0`" and "`_h1`". If a single vector is given, each such automatically detected p value pair receives this same vector. If a single number is given, all elements of all vectors will be assigned this same number (up to the maximum number of looks). If a list is given and any of its elements contains just one number, that number will be extended into a vector (up to the maximum number of looks). The default `NULL` value specifies a "fixed design" (no interim stopping alphas) with the final alpha set to `alpha_global`; no adjustment procedure is run as long as the `adjust` argument is also left at its default `TRUE`. (This is useful for cases where only futility bounds are to be set for stopping.)
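The accepted input forms for `alpha_locals` can be pictured with a small base R helper. This is a hypothetical illustration (`expand_alphas` is not part of POSSA, and the column roots are made-up names); it only mirrors the expansion rules described above.

```r
# Hypothetical helper: expands the accepted input forms into a named list
# with one vector of length `n_looks` per p value column pair.
expand_alphas <- function(alpha_locals, p_roots, n_looks) {
    if (!is.list(alpha_locals)) {
        # a single number or vector is shared by every detected pair
        alpha_locals <- setNames(rep(list(alpha_locals), length(p_roots)),
                                 p_roots)
    }
    lapply(alpha_locals, function(a) {
        if (length(a) == 1) a <- rep(a, n_looks)  # extend a single number
        stopifnot(length(a) == n_looks)           # must match the looks
        a
    })
}

# single number: same alpha at every look, for every detected pair
expand_alphas(0.02, c("p_test1", "p_test2"), n_looks = 3)

# single vector with an NA: the final alpha is left to be calculated
expand_alphas(c(0.001, 0.01, NA), "p_test1", n_looks = 3)

# named list: names are the column roots without "_h0" / "_h1"
expand_alphas(list(p_test4 = c(0.01, 0.01, NA), p_test5 = NA),
              p_roots = NULL, n_looks = 3)
```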
`alpha_global`	Global alpha (expected type 1 error rate in total); `0.05` by default. See also `multi_logic_global` for when multiple p values are being evaluated.
`adjust`	The function via which the initial vector of local alphas is modified with each step of the staircase procedure. Three arguments are passed to it: `adj`, `orig`, and `prev`. The `adj` parameter is mandatory; it passes the pivotal changing value that, starting from an initial value (see `adj_init`), should, via the staircase steps, decrease when the global type 1 error rate is too large, and increase when the global type 1 error rate is too small. The `orig` parameter (optional) always passes the same original vector of alphas as provided via `alpha_locals`. The `prev` parameter (optional) passes the "latest" vector of local alphas, obtained in the previous adjustment step (or, in the initial run, the original vector, i.e., the same as `orig`). When `TRUE` (default), if the given `alpha_locals` contains any `NA`s, an `adjust` function is given internally that simply replaces `NA`s with the varying adjustment value (as `{ prev[is.na(orig)] = adj; return(prev) }`). If `alpha_locals` contains no `NA`s, an `adjust` function is given that multiplies each original local alpha with the varying adjustment value (as `{ return(orig * adj) }`). When set to `FALSE`, there will be no adjustment (staircase procedure omitted): this is useful to calculate the global type 1 error rate for any given set of local alphas. Furthermore, if both `adjust` and `alpha_locals` are left as default (`TRUE` and `NULL`), the staircase procedure will be omitted.
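Written out as standalone functions, the two internal defaults described above would look roughly as follows. The function names are chosen here for illustration only, and `adjust_add` is a hypothetical custom variant, not something POSSA provides.

```r
# NA replacement: fill the originally missing alphas with `adj`
adjust_replace_na <- function(adj, orig, prev) {
    prev[is.na(orig)] <- adj
    prev
}

# Multiplication: scale every original alpha by the same factor
adjust_multiply <- function(adj, orig, prev) {
    orig * adj
}

# Hypothetical custom variant: keep the first look fixed, add `adj` to the rest
adjust_add <- function(adj, orig, prev) {
    c(orig[1], orig[-1] + adj)
}

orig <- c(0.001, NA, NA)
adjust_replace_na(adj = 0.02, orig = orig, prev = orig)

adjust_multiply(adj = 0.8, orig = c(0.01, 0.02, 0.03), prev = NULL)

adjust_add(adj = 0.005, orig = c(0.001, 0.01, 0.01), prev = NULL)
```

With an additive function such as `adjust_add`, the default `adj_init` (global alpha divided by the number of looks) may be too large a starting increment, so setting `adj_init` explicitly is probably advisable.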
`adj_init`	The initial adjustment value that is used as the "`adj`" parameter in the "`adjust`" function and is continually adjusted via the staircase steps (see `staircase_steps` parameter). When `NULL` (default), assuming that "`adj`" is used as a replacement for `NA`s, `adj_init` is calculated as the global alpha divided by the maximum number of looks (Bonferroni correction), as a rough initial approximation. However, multiplication is assumed when finding any multiplication sign (`*`) in a given custom `adjust` function: in such a case, `adj_init` will be `1` by default.
`staircase_steps`	Numeric vector that specifies the (normally decreasing) sequence of step sizes for the staircase that homes in on the specified global error rate by decreasing or increasing the adjustment value (initially: `adj_init`): the step size (numeric value) is added for increase, and subtracted for decrease. Whenever the direction (decrease vs. increase) is changed, the staircase moves on to the next step size. When the direction changes and there are no more steps remaining, the procedure is finished (regardless of the global error rate). By default (`NULL`), the `staircase_steps` is either "`0.01 * (0.5 ^ (seq(0, 11, 1)))`" (giving: `0.01, 0.005, 0.0025, ...`) or "`0.5 * (0.5 ^ (seq(0, 11, 1)))`" (giving: `0.5, 0.25, 0.125, ...`). The latter is chosen when adjustment via multiplication is assumed, which is simply based on finding any multiplication sign (`*`) in a given custom `adjust` function. The former is chosen in any other case.
`alpha_precision`	During the staircase procedure, at any point when the simulated global type 1 error rate first matches the given `alpha_global` at least for the number of fractional digits given here (`alpha_precision`; default: `5`), the procedure stops and the results are printed. (Otherwise, the procedure finishes only when all steps given as `staircase_steps` have been used.)
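Evaluating the two default expressions in R shows the step sizes directly. The variable names are illustrative, and the three-look design used for the starting value is an assumption made only for this example.

```r
steps_replace  <- 0.01 * (0.5 ^ (seq(0, 11, 1)))  # default in all other cases
steps_multiply <- 0.5 * (0.5 ^ (seq(0, 11, 1)))   # when `*` is found in `adjust`

head(steps_replace, 3)    # three largest steps for NA replacement
head(steps_multiply, 3)   # three largest steps for multiplication
length(steps_replace)     # number of available step sizes
min(steps_replace)        # finest step of the replacement staircase

# Default starting value when NAs are replaced (Bonferroni-style),
# assuming three looks; with multiplication the documented default is 1.
alpha_global <- 0.05
n_looks <- 3
adj_init_replace <- alpha_global / n_looks
adj_init_replace
```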
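One plausible reading of "matches for the given number of fractional digits" is equality after rounding. The sketch below assumes exactly that; `matches_global` is a hypothetical helper, and the actual internal comparison may differ.

```r
# Assumed matching rule: equality after rounding to `alpha_precision` digits
matches_global <- function(t1er, alpha_global = 0.05, alpha_precision = 5) {
    round(t1er, alpha_precision) == round(alpha_global, alpha_precision)
}

matches_global(0.050004)                      # differs only in the 6th digit
matches_global(0.05004)                       # differs in the 5th digit
matches_global(0.05004, alpha_precision = 4)  # looser precision
```

Because a simulated rate is a proportion of iterations, its possible values are limited by the number of iterations: with 1000 iterations, for example, rates come in steps of 0.001, so a high `alpha_precision` can only be met when the rate hits the target exactly.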
`fut_locals`	Specifies local futility bounds that may stop the experiment at the given interim looks if the corresponding p value is above the given futility bound value. When `NULL` (default), sets no futility bounds. Otherwise, it follows the same logic as `alpha_locals` and has the same input possibilities (number, numeric vector, or named list of numeric vectors).
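The interplay of a local alpha and a futility bound at a single interim look can be sketched as follows. `interim_decision` is a hypothetical helper for illustration, and the numbers are made up.

```r
# Decision at one interim look for a single p value
interim_decision <- function(p, alpha_local, fut_local = NULL) {
    if (p < alpha_local) {
        return("stop: significant")
    }
    if (!is.null(fut_local) && p > fut_local) {
        return("stop: futility")
    }
    "continue"
}

interim_decision(p = 0.004, alpha_local = 0.01, fut_local = 0.5)
interim_decision(p = 0.62, alpha_local = 0.01, fut_local = 0.5)
interim_decision(p = 0.20, alpha_local = 0.01, fut_local = 0.5)
interim_decision(p = 0.62, alpha_local = 0.01)  # no futility bound set
```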
`multi_logic_a`	When multiple p values are evaluated for local alpha stopping rules, `multi_logic_a` specifies the function used to evaluate the multiple significance outcomes (p values being below or above the given local alphas) as a single `TRUE` or `FALSE` value that decides whether or not to stop at a given look. The default, `'all'`, specifies that all of the p values must be below the local boundary for stopping. The other acceptable character input is `'any'`, which specifies that the collection stops when any of the p values pass the boundary for stopping. Instead of these strings, the actual `all` and `any` functions would lead to identical outcomes, respectively, but the processing would be far slower (since the string inputs `'all'` or `'any'` select a dedicated, faster internal solution). For custom combinations, any custom function can be given, which will take, as arguments, the p value columns in their given order (either in the `p_values` data frame, or as specified in `alpha_locals`), and should return a single `TRUE` or `FALSE` value.
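A custom combination rule might look like the sketch below. It rests on an assumption that is not confirmed by the text above: that each argument arrives as a logical significance outcome (`TRUE` when that p value passed its local alpha), one argument per p value column, in column order. The function and argument names are hypothetical.

```r
# Stop only if the main test is significant together with at least one
# of the two secondary tests (arguments assumed to be logical outcomes).
main_plus_one <- function(sig_main, sig_a, sig_b) {
    sig_main && (sig_a || sig_b)
}

main_plus_one(TRUE, FALSE, TRUE)
main_plus_one(FALSE, TRUE, TRUE)

# For comparison, what the 'all' and 'any' logic would give for the same
# outcomes
all(TRUE, FALSE, TRUE)
any(TRUE, FALSE, TRUE)
```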
`multi_logic_fut`	Same as `multi_logic_a` (again with `'all'` as default), but for futility bounds (for the columns specified in `fut_locals`).
`multi_logic_global`	Similar to `multi_logic_a`, but for the calculation of the global type 1 error rate (again: in case of multiple p values being evaluated; otherwise this parameter is not relevant), and with `'any'` by default. This default means that if any of the p values under evaluation (specified via `alpha_locals` or detected automatically) is significant (p value below the given local alpha at the stopping of the simulated "experiment" iteration) in case of the H0 scenario, this is counted as a type 1 error. If `'all'` were specified, only cases with all evaluated p values being significant are counted as type 1 errors. In either case, the ratio of outcomes with such type 1 errors (out of all iterations) gives the global type 1 error rate, which is intended to (approximately) match (via the adjustment procedure) the value specified as `alpha_global`. This global type 1 error is also what is printed to the console in the end as the "combined" global error rate. Furthermore, the logic given here is also used for the calculation of the "combined" global power printed to the console. In this case, the `'any'` logic, for example, would mean that, if any of the p values are significant at the end of the experiment, this is a positive finding. The global power is then the ratio of iterations with such positive findings.
`group_by`	When given as a character element or vector, specifies the factors by which to group the analysis: the `p_values` data will be divided into parts by these factors and these parts will be analyzed separately, with power and error information calculated for each part. By default (`NULL`), it identifies factors, if any, given to the `sim` function (via `fun_obs`) that created the given `p_values` data.
`alpha_loc_nonstop`	Optional "non-stopper" alphas via which to evaluate p values per look, but without stopping the data collection regardless of statistical significance. Must be a list with names indicating p value column name pairs, similarly to the `alpha_locals` argument; see `alpha_locals` for details.
`round_to`	Number of fractional digits (default: `5`) to round to, for the displayed numeric information (such as alphas and power); this mainly serves as the default value for `printing`.
`iter_limit`	With certain non-ideal or incorrect inputs, the staircase may get stuck in the loop of a given step. The `iter_limit` parameter specifies the number of iterations (by default `100`) after which the script pauses the loop and offers the user the option to stop the procedure. If the user chooses to continue, the offer is repeated after each further `iter_limit` iterations (e.g., by default, at `100`, then `200`, then `300`, etc.).
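The difference between the `'any'` and `'all'` logic for the global type 1 error rate can be seen in a small simulation. This is a base R toy under simplifying assumptions (two independent tests, one look, uniform H0 p values), not POSSA's calculation; all variable names are illustrative.

```r
set.seed(8)
n_iter <- 10000
alpha <- 0.05

# H0 p values of two independent tests, one row per simulated iteration
sig <- cbind(test1 = runif(n_iter) < alpha,
             test2 = runif(n_iter) < alpha)

# 'any': an iteration is a type 1 error if at least one test is significant
t1er_any <- mean(rowSums(sig) > 0)
# 'all': an iteration is a type 1 error only if both tests are significant
t1er_all <- mean(rowSums(sig) == ncol(sig))

c(any = t1er_any, all = t1er_all)
```

For independent tests, the `'any'` rate is expected to be close to `1 - (1 - alpha)^2` and the `'all'` rate close to `alpha^2`, which is why the same local alphas can imply very different global error rates depending on the chosen logic.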
`seed`	Number for `set.seed`; `8` by default. Set to `NULL` for random seed.
`prog_bar`	Logical, `FALSE` by default. If `TRUE`, shows progress bar.
`hush`	Logical. If `TRUE`, prevents printing any details (or the progress bar) to console.

Value

Returns a list (with class "possa_pow_list") that includes all details of the calculated power, type 1 error rate (T1ER), and sample information. This list can be printed legibly (via POSSA's print() method).

Note

For replicability, in case the adjust function uses any randomization, set.seed is executed at the beginning of this function each time it is called; see the seed parameter.

This function uses, internally, the data.table R package.

Examples


# below is a (very) minimal example
# for more, see the vignettes via https://github.com/gasparl/possa#usage

# create sampling function
customSample = function(sampleSize) {
    list(
        sample1 = rnorm(sampleSize, mean = 0, sd = 10),
        sample2_h0 = rnorm(sampleSize, mean = 0, sd = 10),
        sample2_h1 = rnorm(sampleSize, mean = 5, sd = 10)
    )
}

# create testing function
customTest = function(sample1, sample2_h0, sample2_h1) {
    c(
        p_h0 = t.test(sample1, sample2_h0, 'less', var.equal = TRUE)$p.value,
        p_h1 = t.test(sample1, sample2_h1, 'less', var.equal = TRUE)$p.value
    )
}

# run simulation
dfPvals = sim(
    fun_obs = customSample,
    n_obs = 80,
    fun_test = customTest,
    n_iter = 1000
)

# get power info
pow(dfPvals)
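A sequential variant of the example might look like the sketch below. It assumes that `sim` accepts a vector for `n_obs` giving the cumulative sample size at each look (this is not stated on this page), and it runs only when the POSSA package is installed. Passing `alpha_locals = NA` marks every local alpha as "to be calculated", so the staircase replaces them with one common adjusted value.

```r
# sampling and testing functions, as in the minimal example
customSample <- function(sampleSize) {
    list(
        sample1 = rnorm(sampleSize, mean = 0, sd = 10),
        sample2_h0 = rnorm(sampleSize, mean = 0, sd = 10),
        sample2_h1 = rnorm(sampleSize, mean = 5, sd = 10)
    )
}
customTest <- function(sample1, sample2_h0, sample2_h1) {
    c(
        p_h0 = t.test(sample1, sample2_h0, 'less', var.equal = TRUE)$p.value,
        p_h1 = t.test(sample1, sample2_h1, 'less', var.equal = TRUE)$p.value
    )
}

if (requireNamespace("POSSA", quietly = TRUE)) {
    # Assumption: three looks, at 27, 54, and 81 observations per sample
    dfPvalsSeq <- POSSA::sim(
        fun_obs = customSample,
        n_obs = c(27, 54, 81),
        fun_test = customTest,
        n_iter = 1000
    )
    # all local alphas are NA, hence calculated via the staircase
    POSSA::pow(dfPvalsSeq, alpha_locals = NA)
}
```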

[Package POSSA version 0.6.4 Index]