refine {optrefine} | R Documentation |
Refine initial stratification
Description
Refine an initial stratification by splitting each stratum, or a specified subset of strata, into two refined strata. If no initial stratification is provided, one is first generated using prop_strat().
Usage
refine(object = NULL, z = NULL, X = NULL, strata = NULL, options = list())
Arguments
object |
an optional object of class |
z |
vector of treatment assignment; only used if |
X |
covariate matrix/data.frame; only used if |
strata |
vector of initial strata assignments; only used if |
options |
list containing various options described in the |
Details
The options argument can contain any of the following elements:
solver: character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk" unless a gurobi installation is detected, in which case it is set to "gurobi". It is recommended to use "gurobi" if available.
standardize: boolean whether or not to standardize the covariates in X. Default is TRUE.
criterion: which optimization criterion to use. Options are "max", "sum", or "combo", referring to whether to optimize the maximum standardized mean difference (SMD), the sum of all SMDs, or a combination of the maximum and the sum. The default is "combo".
integer: boolean whether to use integer programming as opposed to randomized rounding of linear programs. Note that setting this to TRUE may cause this function to never finish, depending on the size of the data, and is not recommended except for tiny data sets.
wMax: how much to weight the maximum standardized mean difference compared to the sum. Only used if criterion is set to "combo". Default is 5.
ist: which strata to split. Should be a level from the specified strata or a vector of multiple levels. Default is to split all strata.
minsplit: the minimum number of treated and control units to allow in a refined stratum. Default is 10.
threads: How many threads you'd like the optimization to use if using the "gurobi" solver. Uses all available threads by default
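The three criterion settings can be illustrated with a small base R sketch. This is not the package's internal code: the pooled-SD form of the SMD, the hypothetical helper names smd() and objective(), and the form of "combo" as wMax times the maximum plus the sum are all assumptions made for illustration.

```r
# Illustrative only; the package's internal objective may differ.
# Absolute standardized mean difference for one covariate,
# using a pooled standard deviation (an assumed definition).
smd <- function(x, z) {
  m1 <- mean(x[z == 1])
  m0 <- mean(x[z == 0])
  s <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  abs(m1 - m0) / s
}

# Hypothetical helper: compute the SMD of every covariate within every
# stratum, then summarise according to the chosen criterion.
objective <- function(X, z, strata, criterion = "combo", wMax = 5) {
  smds <- unlist(lapply(split(seq_along(z), strata), function(idx) {
    apply(X[idx, , drop = FALSE], 2, smd, z = z[idx])
  }))
  switch(criterion,
         max = max(smds),
         sum = sum(smds),
         combo = wMax * max(smds) + sum(smds))  # assumed form of "combo"
}

# Simulated data: two covariates, two strata of 100 units each
set.seed(1)
n <- 200
X <- cbind(age = rnorm(n), bp = rnorm(n))
z <- rbinom(n, 1, 0.5)
strata <- rep(1:2, each = n / 2)

obj_max <- objective(X, z, strata, "max")
obj_sum <- objective(X, z, strata, "sum")
obj_combo <- objective(X, z, strata, "combo", wMax = 5)
```

Under this reading, a larger wMax pushes the optimization toward reducing the single worst imbalance, while a smaller wMax favors reducing imbalance on average.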
Note that setting a seed before using this function will ensure that the results are reproducible on the same machine, but results may vary across machines due to how the optimization solvers work.
Value
Object of class "strat", which is a list object with the following components:
z: treatment vector
X: covariate matrix
base_strata: initial stratification
refined_strata: refined stratification
details: various details about the optimization that can be ignored in practice, but may be interesting:
valueIP, valueLP: integer (determined via randomized rounding, unless integer option set to true) and linear programming scaled objective values
n_fracs: number of units with fractional LP solutions
rand_c_prop, rand_t_prop: proportions of the control and treated units in each stratum that were selected with randomness
pr: linear programming solution, with rows corresponding to the strata and columns to the units
criterion: criterion used in the optimization (see the details about the options for the optimization)
wMax: weight placed on the maximum standardized mean difference in the optimization (see the details about the options for the optimization)
X_std: standardized version of X
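Because the returned object is a list, its components can be inspected with ordinary list and table operations. The sketch below uses a hand-built stand-in, not real refine() output, so that it runs without the package. The stratum labels such as "1:2" are an assumed format; a real object also has class "strat" and carries X and details.

```r
# Hypothetical stand-in for the list returned by refine()
base <- factor(rep(c("1", "2"), each = 6))
ref <- list(
  z = rep(c(0, 1, 1, 0, 0, 1), times = 2),
  base_strata = base,
  # Assumed label format: <base stratum>:<refined half>
  refined_strata = interaction(base, rep(1:2, times = 6), sep = ":")
)

# How each base stratum was divided among the refined strata
split_tab <- table(base = ref$base_strata, refined = ref$refined_strata)

# Treated and control counts within each refined stratum
size_tab <- table(refined = ref$refined_strata, z = ref$z)
```

A table like size_tab is a quick way to see how many treated and control units each refined stratum contains.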
Examples
# Choose 400 patients and 4 covariates to work with for the example
set.seed(15)
samp <- sample(1:nrow(rhc_X), 400)
cov_samp <- sample(1:26, 4)
# Let it create propensity score strata for you and then refine them
ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"])
# Or, specify your own initial strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 3)
ref <- refine(X = ps$X, z = ps$z, strata = ps$base_strata)
# Can just input the output of prop_strat() directly
ref <- refine(object = ps)
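
The elements described in Details are passed to refine() as a named list. The following is a minimal sketch, assuming the optrefine package and its rhc_X data are installed. The option values are arbitrary illustrations, not recommendations, and the refine() call is skipped when the package is unavailable.

```r
# Option names come from the Details section; the values are illustrative
opts <- list(
  solver = "Rglpk",      # avoid depending on a gurobi installation
  standardize = TRUE,
  criterion = "max",     # optimize only the maximum SMD
  minsplit = 5           # allow smaller refined strata than the default of 10
)

if (requireNamespace("optrefine", quietly = TRUE)) {
  library(optrefine)
  set.seed(15)  # reproducible on the same machine
  samp <- sample(1:nrow(rhc_X), 400)
  cov_samp <- sample(1:26, 4)
  ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"],
                options = opts)
  print(table(ref$refined_strata))
}
```

Elements left out of the list keep the defaults described in Details.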