
rand_pvals {optrefine}

R Documentation

Generate P-values using empirical randomization null distribution

Description

Randomize the treatment assignment within strata to generate the randomization distribution of covariate balance given the strata and observed covariate values. Compare the observed covariate balance to this null distribution to calculate P-values.

Usage

rand_pvals(
  object = NULL,
  z = NULL,
  X = NULL,
  base_strata = NULL,
  refined_strata = NULL,
  options = list()
)

Arguments

`object`	an optional object of class `strat`, typically created using `strat()` or as a result of a call to `prop_strat()` or `refine()`. If not provided, `z` and `X` must be specified
`z`	vector of treatment assignment; only used if `object` is not supplied
`X`	covariate matrix/data.frame; only used if `object` is not supplied
`base_strata`	optional initial stratification for which to calculate the empirical randomization null distribution; only used if `object` is not supplied
`refined_strata`	optional refined stratification for which to calculate the empirical randomization null distribution; only used if `object` is not supplied
`options`	list of additional options, described in the Details section below

Details

The literature on multivariate matching has recently developed a new way of evaluating covariate imbalances, comparing the imbalances found in an observational matched sample to the imbalances that would have been produced in the same data by randomization (Pimentel et al. 2015, Yu 2021). We modify that approach for use with strata, randomizing patients within strata. For a given stratification, we create a large number of stratified randomized experiments, taking the actual patients in their actual strata, and randomizing them to treatment or control with fixed within-stratum sample sizes.
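
The within-stratum randomization described above can be sketched in base R. This is a minimal illustration on hypothetical toy data (`strata`, `z`, and `randomize_within_strata()` are made up for the example), not the package's internal code. It permutes the treatment labels inside each stratum, so every stratum keeps its observed number of treated and control units.

```r
set.seed(1)

# Hypothetical toy data: 3 strata, 12 units, binary treatment z
strata <- rep(c("A", "B", "C"), times = c(5, 4, 3))
z <- c(1, 0, 0, 1, 0,  1, 1, 0, 0,  0, 1, 0)

# Permute z within each stratum; ave() applies FUN to each stratum's
# values and returns them in the original unit order
randomize_within_strata <- function(z, strata) {
  ave(z, strata, FUN = function(v) v[sample.int(length(v))])
}

# Build many randomized assignments: one column per randomization
nrand <- 1000
z_rand <- replicate(nrand, randomize_within_strata(z, strata))
dim(z_rand)

# The treated count per stratum is the same in every randomization
obs_counts <- tapply(z, strata, sum)
counts_fixed <- all(apply(z_rand, 2, function(zz) {
  all(tapply(zz, strata, sum) == obs_counts)
}))
counts_fixed
```

Permuting indices with `sample.int()` rather than calling `sample()` on the values avoids the special behaviour of `sample()` when a stratum has a single unit.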

To see how the actual observational imbalance in covariates compares with the covariate imbalance in the randomized experiments built from the same strata, patients, and covariates, we look at four metrics based on standardized mean differences (SMDs): (1) the scaled objective value, which is the maximum SMD, the sum of all SMDs, or a weighted combination of the two, depending on the `criterion` option; (2) the maximum SMD across covariates and strata; (3) the average SMD across covariates and strata; and (4) the average SMD across strata for each covariate individually. For each of these metrics, we record the observational value, the median and minimum of the randomized values, and the proportion of randomized values more imbalanced than the observational value (the P-value).
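
As a rough illustration of how such a P-value is formed, the sketch below computes one imbalance statistic on hypothetical toy data and compares it with its randomization distribution. The statistic `avg_smd()` (absolute difference in means over a pooled standard deviation, averaged across strata) is an assumption made for the example and may differ from the SMD definition used inside the package.

```r
set.seed(2)

# Hypothetical toy data: one covariate x, two strata, binary treatment z
n <- 40
strata <- rep(c("s1", "s2"), each = n / 2)
z <- rep(c(1, 0, 1, 0), times = c(8, 12, 10, 10))
x <- rnorm(n) + 0.5 * z  # treated units are shifted, creating imbalance

# Illustrative imbalance statistic (an assumption, not the package's formula)
avg_smd <- function(z, x, strata) {
  smd_one <- function(i) {
    xt <- x[i][z[i] == 1]
    xc <- x[i][z[i] == 0]
    abs(mean(xt) - mean(xc)) / sqrt((var(xt) + var(xc)) / 2)
  }
  mean(vapply(split(seq_along(z), strata), smd_one, numeric(1)))
}

obs <- avg_smd(z, x, strata)

# Null distribution: permute z within strata and recompute the statistic
nrand <- 500
null_vals <- replicate(nrand, {
  z_perm <- ave(z, strata, FUN = function(v) v[sample.int(length(v))])
  avg_smd(z_perm, x, strata)
})

# Observed value, median and minimum of the randomized values, and the
# proportion of randomized values more imbalanced than the observed one
summ <- c(observed = obs,
          null_median = median(null_vals),
          null_min = min(null_vals),
          p_value = mean(null_vals > obs))
summ
```

A small P-value here means that few randomized experiments were as imbalanced as the observed data, so the observed stratification balances this covariate worse than randomization typically would.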

The options list argument can contain any of the following elements:

nrand: how many times to randomize the treatment assignment when forming the null distribution. Default is 10000
criterion: which optimization criterion to use when calculating the objective value. Options are "max", "sum", or "combo", referring to whether to include the maximum standardized mean difference (SMD), the sum of all SMDs, or a combination of the maximum and the sum. The default is "combo"
wMax: how much to weight the maximum standardized mean difference compared to the sum. Only used if criterion is set to "combo". Default is 5
incl_base: whether to include columns for the initial stratification in the table. Default is TRUE if a base stratification is provided
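
The sketch below shows one way to assemble an `options` list that overrides some of the documented defaults. The names `default_opts`, `user_opts`, and `opts` are hypothetical, and the merge with `modifyList()` only illustrates the effective settings; it is not a claim about how `rand_pvals()` fills in defaults internally.

```r
# Documented defaults from the list above (incl_base is omitted because
# its default depends on whether a base stratification is provided)
default_opts <- list(nrand = 10000, criterion = "combo", wMax = 5)

# Hypothetical overrides: fewer randomizations and the "max" criterion
user_opts <- list(nrand = 1000, criterion = "max")

# modifyList() keeps the default for anything the user did not set
opts <- modifyList(default_opts, user_opts)
str(opts)

# With the package attached and `ref` created by refine(), the call
# would pass only the overrides:
# rpvals <- rand_pvals(object = ref, options = user_opts)
```

Note that `wMax` is only used when `criterion` is "combo", so it has no effect with the settings shown here.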

Value

List with three components:

pvals: list containing base and refined elements, each of which is a list with randomization P-values for the objective value (NULL for the base stratification), the maximum standardized mean difference (SMD), the average SMD across covariates and strata, and the average SMD across strata for each covariate (this last element is a vector)
obs_details: list containing base and refined elements, each of which is a list with the observed values for the objective value (NULL for the base stratification), the maximum SMD, and the average SMD across strata for each covariate (this last element is a vector)
rand_details: list containing base and refined elements, each of which is a list with a vector of nrand randomized values for the objective value (NULL for the base stratification), a vector of nrand randomized values for the maximum SMD, and the randomized values of the average SMD across strata for each covariate (this last element is a matrix with nrand rows and one column per covariate)

Examples

library(optrefine)

# Choose 500 patients and 5 covariates to work with for the example
set.seed(15)
samp <- sample(1:nrow(rhc_X), 500)
cov_samp <- sample(1:26, 5)

# Let it create propensity score strata for you and then refine them
ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"])

# Calculate info for covariate balance randomization distribution
rpvals <- rand_pvals(object = ref, options = list(nrand = 100))

# Look at the P-values for the base and refined stratifications
rpvals$pvals

[Package optrefine version 1.1.0 Index]