R: Refine initial stratification

refine {optrefine}

R Documentation

Refine initial stratification

Description

Refine an initial stratification by splitting each stratum or specified subset of strata into two refined strata. If no initial stratification is provided, one is first generated using prop_strat().

Usage

refine(object = NULL, z = NULL, X = NULL, strata = NULL, options = list())

Arguments

`object`	an optional object of class `strat`, typically created using `strat()` or as a result of a call to `prop_strat()`. If not provided, `z` and `X` must be specified
`z`	vector of treatment assignments; only used if `object` is not supplied
`X`	covariate matrix/data.frame; only used if `object` is not supplied
`strata`	vector of initial strata assignments; only used if `object` is not supplied. Can be `NULL`, in which case an initial stratification using the quintiles of the propensity score is generated using `prop_strat()` and the generated propensity score is also added to the X matrix as an extra covariate
`options`	list containing various options described in `Details` below

Details

The options argument can contain any of the following elements:

solver: character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk" unless a gurobi installation is detected, in which case it is set to "gurobi". It is recommended to use "gurobi" if available.
standardize: logical; whether to standardize the covariates in X. Default is TRUE
criterion: which optimization criterion to use. Options are "max", "sum", or "combo", referring to whether to optimize the maximum standardized mean difference (SMD), the sum of all SMDs, or a combination of the maximum and the sum. The default is "combo"
integer: logical; whether to use integer programming instead of randomized rounding of linear programs. Note that setting this to TRUE may cause the function to never finish, depending on the size of the data, and is not recommended except for tiny data sets
wMax: how much to weight the maximum standardized mean difference compared to the sum. Only used if criterion is set to "combo". Default is 5
ist: which strata to split. Should be a level from the specified strata or a vector of multiple levels. Default is to split all strata
minsplit: the minimum number of treated and control units to allow in a refined stratum. Default is 10
threads: number of threads for the optimization to use with the "gurobi" solver. Uses all available threads by default
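
To make the three criterion choices concrete, here is a rough base-R sketch of how "max", "sum", and "combo" could be computed for a given stratification. It does not call optrefine. The simulated data, the pooled-SD form of the SMD, and the formula wMax * max + sum for "combo" are all assumptions made for illustration; the package may define and scale these quantities differently.

```r
# Illustrative only: the SMD definition and the form of "combo" are
# assumptions, not taken from the optrefine source.
set.seed(1)
n <- 200
z <- rbinom(n, 1, 0.4)                       # simulated treatment indicator
X <- cbind(age = rnorm(n, 50 + 5 * z, 10),   # two simulated covariates
           bmi = rnorm(n, 27 + z, 4))

# Four strata from the quartiles of one covariate (stand-in for real strata)
strata <- cut(X[, "age"], breaks = quantile(X[, "age"], 0:4 / 4),
              include.lowest = TRUE, labels = FALSE)

# Absolute SMD of one covariate within one stratum, using the pooled
# standard deviation of the treated and control groups (assumed form)
smd <- function(x, z) {
  pooled_sd <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  abs(mean(x[z == 1]) - mean(x[z == 0])) / pooled_sd
}

# Matrix of SMDs: one row per stratum, one column per covariate
smd_mat <- t(sapply(sort(unique(strata)), function(s) {
  in_s <- strata == s
  apply(X[in_s, , drop = FALSE], 2, smd, z = z[in_s])
}))

wMax <- 5
crit_max   <- max(smd_mat)                 # "max": worst single imbalance
crit_sum   <- sum(smd_mat)                 # "sum": total imbalance
crit_combo <- wMax * crit_max + crit_sum   # assumed form of "combo"
```

Under this reading, a larger wMax pushes the optimization toward reducing the single worst imbalance, while a smaller value favors overall balance across all strata and covariates.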

Note that setting a seed before using this function will ensure that the results are reproducible on the same machine, but results may vary across machines due to how the optimization solvers work.
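
As a sketch of how the options and the seed fit together, the snippet below builds an options list from the element names documented above and passes it to refine() after calling set.seed(). The option values are arbitrary illustrative choices, and the call is guarded so the snippet still runs where optrefine is not installed.

```r
# Option names are the ones documented above; the values are arbitrary
# choices for illustration.
opts <- list(
  solver    = "Rglpk",   # open-source solver; "gurobi" if installed
  criterion = "combo",
  wMax      = 5,
  minsplit  = 10
)

# Guarded so the snippet runs even where optrefine is not installed
if (requireNamespace("optrefine", quietly = TRUE)) {
  data("rhc_X", package = "optrefine")
  set.seed(15)                                # seed set before refine()
  samp <- sample(seq_len(nrow(rhc_X)), 400)
  cov_samp <- sample(1:26, 4)
  ref <- optrefine::refine(X = rhc_X[samp, cov_samp],
                           z = rhc_X[samp, "z"],
                           options = opts)
}
```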

Value

Object of class "strat", which is a list object with the following components:

z: treatment vector
X: covariate matrix
base_strata: initial stratification
refined_strata: refined stratification
details: various details about the optimization that can be ignored in practice, but may be interesting:
- valueIP, valueLP: scaled objective values of the integer solution (determined via randomized rounding, unless the integer option is set to TRUE) and of the linear programming solution
- n_fracs: number of units with fractional LP solutions
- rand_c_prop, rand_t_prop: proportions of the control and treated units in each stratum that were selected with randomness
- pr: linear programming solution, with rows corresponding to the strata and columns to the units
- criterion: criterion used in the optimization (see the details about the options for the optimization)
- wMax: weight placed on the maximum standardized mean difference in the optimization (see the details about the options for the optimization)
- X_std: standardized version of X
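
One way to inspect a result is to cross-tabulate the components listed above. The sketch below uses a hand-built list as a stand-in for a "strat" object so that it runs without the package. Only the component names z, base_strata, and refined_strata come from this page; the stratum labels and the random split are hypothetical and do not reflect how refine() labels or chooses refined strata.

```r
# Hand-built stand-in for a "strat" result. Only the component names are
# taken from the documentation; the labels and the split are hypothetical.
set.seed(2)
n <- 60
mock <- list(
  z = rbinom(n, 1, 0.5),
  base_strata = factor(rep(c("s1", "s2", "s3"), each = n / 3))
)

# Pretend each base stratum was split into two refined strata
half <- sample(c(1, 2), n, replace = TRUE)
mock$refined_strata <- interaction(mock$base_strata, half, sep = ":")

# How the base strata map onto the refined strata
split_table <- table(base = mock$base_strata, refined = mock$refined_strata)

# Treated and control counts per refined stratum, e.g. to compare
# against the minsplit option
counts <- table(refined = mock$refined_strata, z = mock$z)
```

With a real result, the same two table() calls could be applied to the corresponding components of the returned object.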

Examples

# Choose 400 patients and 4 covariates to work with for the example
set.seed(15)
samp <- sample(1:nrow(rhc_X), 400)
cov_samp <- sample(1:26, 4)

# Let it create propensity score strata for you and then refine them
ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"])

# Or, specify your own initial strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 3)
ref <- refine(X = ps$X, z = ps$z, strata = ps$base_strata)

# Or pass the output of prop_strat() directly
ref <- refine(object = ps)

[Package optrefine version 1.1.0 Index]