best_split {optrefine} | R Documentation |
Find the best split for a stratum
Description
Runs split_stratum() many times and selects the best result.
Usage
best_split(
  z,
  X,
  strata,
  ist,
  nc_list,
  nt_list,
  wMax = 5,
  wEach = 1,
  solver = "Rglpk",
  integer = FALSE,
  min_split = 10,
  threads = threads
)
Arguments
z |
Vector of treatment assignment |
X |
Covariate matrix or data.frame |
strata |
vector of initial strata assignments; only used if |
ist |
the stratum to be split |
nc_list |
a list of choices for the |
nt_list |
a list of choices for the |
wMax |
the weight the objective places on the maximum epsilon |
wEach |
the weight the objective places on each epsilon |
solver |
character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk" |
integer |
boolean indicating whether to use integer programming instead of randomized rounding.
Default is FALSE |
min_split |
a numeric specifying the minimum number of control units and of treated units
to be tolerated in a stratum. Any combination of elements
from |
threads |
how many threads to use in the optimization if using "gurobi" as the solver. Default will use all available threads |
Value
A list containing the following elements:
valuesIP, valuesLP: matrices containing integer and linear programming scaled objective values for each sample size tried, with rows corresponding to the elements of nc_list and columns corresponding to the elements of nt_list
besti, bestj: indices of the best sample sizes in nc_list and in nt_list, respectively
n_smds: number of standardized mean differences contributing to the objective values (multiply the scaled objective values by this number to get the true objective values)
n_fracs: number of units with fractional LP solutions in the best split
rand_c_prop, rand_t_prop: proportions of the control and treated units in each stratum that were selected with randomness for the best split
pr: linear programming solution for the best split, with rows corresponding to the strata and columns to the units
selection: vector of selected strata for each unit in the initial stratum to be split for the best split
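The indexing and rescaling described above can be sketched in base R. The result list below is a hypothetical stand-in with made-up numbers, not output from best_split(); only the element names (valuesIP, besti, bestj, n_smds) and the row/column layout come from this page.

```r
# Hypothetical candidate sample sizes (made-up values for illustration)
nc_list <- list(c(50, 150), c(80, 120))  # two options for control sizes
nt_list <- list(c(30, 70))               # one option for treated sizes

# Hypothetical stand-in for a best_split() result; numbers are invented.
# Rows of valuesIP follow nc_list, columns follow nt_list.
res <- list(
  valuesIP = matrix(c(0.25, 0.125),
                    nrow = length(nc_list), ncol = length(nt_list)),
  besti = 2,
  bestj = 1,
  n_smds = 16
)

# Sample sizes behind the best split
best_nc <- nc_list[[res$besti]]
best_nt <- nt_list[[res$bestj]]

# Scaled objective of the best split, then the unscaled ("true") objective
best_scaled <- res$valuesIP[res$besti, res$bestj]
best_true <- best_scaled * res$n_smds
```

With a real result, the same lines would apply after replacing res with the list returned by best_split() and reusing the nc_list and nt_list passed to it.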
Examples
# Generate a small data set
set.seed(25)
samp <- sample(1:nrow(rhc_X), 1000)
cov_samp <- sample(1:26, 10)
# Create some strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 5)
# Save the sample sizes
tab <- table(ps$z, ps$base_strata)
# Choose the best sample sizes among the options provided
best_split(z = ps$z, X = ps$X, strata = ps$base_strata, ist = 1,
           nc_list = list(c(floor(tab[1, 1] * 0.25), ceiling(tab[1, 1] * 0.75)),
                          c(floor(tab[1, 1] * 0.4), ceiling(tab[1, 1] * 0.6))),
           nt_list = list(c(floor(tab[2, 1] * 0.3), ceiling(tab[2, 1] * 0.7))),
           min_split = 5)