R: Split one stratum into multiple strata

split_stratum {optrefine}

R Documentation

Split one stratum into multiple strata

Description

Split one stratum into multiple strata with specified sample sizes.

Usage

split_stratum(
  z,
  X,
  strata,
  ist,
  nc,
  nt,
  wMax = 5,
  wEach = 1,
  solver = "Rglpk",
  integer = FALSE,
  threads = NULL
)

Arguments

`z`	Vector of treatment assignment
`X`	Covariate matrix or data.frame
`strata`	vector of initial strata assignments. Can be `NULL`, in which case an initial stratification using the quintiles of the propensity score is generated using `prop_strat()`, and the generated propensity score is also added to the X matrix as an extra covariate
`ist`	the stratum to be split
`nc`	a vector stating how many control units to place in each of the new split strata. The sum must be the total number of controls in the stratum to be split
`nt`	a vector stating how many treated units to place in each of the new split strata. The sum must be the total number of treated units in the stratum to be split
`wMax`	the weight the objective places on the maximum epsilon
`wEach`	the weight the objective places on each epsilon
`solver`	character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk"
`integer`	logical; whether to use integer programming instead of randomized rounding. Default is `FALSE`. Setting this to `TRUE` is not recommended, as the problem may never finish
`threads`	number of threads to use in the optimization if using "gurobi" as the solver. The default (`NULL`) uses all available threads
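
The sum constraints on `nc` and `nt` can be checked before the function is called. The following is a minimal base-R sketch, not package code: `split_sizes()` is a hypothetical helper, and `z` and `strata` are simulated purely for illustration.

```r
# Hypothetical helper (not part of optrefine): turn a total count and a
# vector of proportions into integer sizes that sum exactly to the total.
split_sizes <- function(total, props) {
  stopifnot(abs(sum(props) - 1) < 1e-8)
  sizes <- floor(total * props)
  # Assign any units lost to rounding down to the last new stratum
  sizes[length(sizes)] <- sizes[length(sizes)] + (total - sum(sizes))
  sizes
}

# Simulated treatment indicator and initial strata (illustrative only)
set.seed(1)
z <- rbinom(200, 1, 0.4)
strata <- sample(1:5, 200, replace = TRUE)

# Count controls and treated units in the stratum to be split
ist <- 1
n_control <- sum(z == 0 & strata == ist)
n_treated <- sum(z == 1 & strata == ist)

nc <- split_sizes(n_control, c(0.25, 0.75))
nt <- split_sizes(n_treated, c(0.3, 0.7))

# Both vectors must add up to the totals in the stratum being split
stopifnot(sum(nc) == n_control, sum(nt) == n_treated)
```

Putting the rounding remainder in the last element is one arbitrary choice; any allocation works as long as the sums match the stratum totals.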

Value

A list containing the following elements:

valueIP, valueLP: integer and linear programming scaled objective values
n_smds: number of standardized mean differences contributing to the objective values (multiply the scaled objective values by this number to get the true objective values)
n_fracs: the number of units with fractional linear programming solutions
rand_c_prop, rand_t_prop: proportions of the control and treated units in each stratum that were selected with randomness
pr: linear programming solution, with rows corresponding to the strata and columns to the units
selection: vector of selected strata for each unit in the initial stratum to be split
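
As a rough illustration of how these elements relate, the sketch below builds a mock list with the same element names as the documented return value. All numbers are invented and are not real output of `split_stratum()`; the way fractional units are counted is an assumption about how `pr` should be read, not the package's own computation.

```r
# Mock object using the documented element names; the values are
# invented for illustration and are not real output.
res <- list(
  valueLP = 0.10,
  valueIP = 0.12,
  n_smds = 20,
  pr = matrix(c(1, 0, 0.5,
                0, 1, 0.5), nrow = 2, byrow = TRUE),
  selection = c(1, 2, 2)
)

# Undo the scaling to recover the unscaled objective values
true_lp <- res$valueLP * res$n_smds
true_ip <- res$valueIP * res$n_smds

# Rows of pr are strata and columns are units; here a unit is counted as
# fractional if its column has an entry strictly between 0 and 1
n_fractional <- sum(apply(res$pr, 2, function(p) any(p > 0 & p < 1)))

# Number of units assigned to each of the new strata
sizes <- table(res$selection)
```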

Examples

# Generate a small data set
set.seed(25)
samp <- sample(1:nrow(rhc_X), 1000)
cov_samp <- sample(1:26, 10)

# Create some strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 5)

# Save the sample sizes
tab <- table(ps$z, ps$base_strata)

# Split stratum 1 into two strata with the specified sample sizes
split_stratum(z = ps$z, X = ps$X, strata = ps$base_strata, ist = 1,
              nc = c(floor(tab[1, 1] * 0.25), ceiling(tab[1, 1] * 0.75)),
              nt = c(floor(tab[2, 1] * 0.3), ceiling(tab[2, 1] * 0.7)))

[Package optrefine version 1.1.0 Index]