split_stratum {optrefine}    R Documentation

Split one stratum into multiple strata

Description

Split one stratum into multiple strata with specified sample sizes.

Usage

split_stratum(
  z,
  X,
  strata,
  ist,
  nc,
  nt,
  wMax = 5,
  wEach = 1,
  solver = "Rglpk",
  integer = FALSE,
  threads = NULL
)

Arguments

z

Vector of treatment assignment

X

Covariate matrix or data.frame

strata

vector of initial strata assignments. Can be NULL, in which case an initial stratification using the quintiles of the propensity score is generated using prop_strat(), and the generated propensity score is also added to the X matrix as an extra covariate

ist

the stratum to be split

nc

a vector stating how many control units to place in each of the new split strata. The sum must be the total number of controls in the stratum to be split

nt

a vector stating how many treated units to place in each of the new split strata. The sum must be the total number of treated units in the stratum to be split

wMax

the weight the objective places on the maximum epsilon

wEach

the weight the objective places on each epsilon

solver

character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk"

integer

boolean indicating whether to use integer programming instead of randomized rounding. Default is FALSE. Setting this to TRUE is not recommended, as the problem may never finish

threads

how many threads to use in the optimization if using "gurobi" as the solver. Default will use all available threads
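
As a rough illustration of how wMax and wEach interact, the sketch below assumes the objective is a weighted sum of the largest epsilon and of all epsilons. That form is an assumption made for illustration only; the exact objective is not stated on this page. The function name weighted_objective and the eps values are hypothetical.

```r
# Hypothetical helper (not part of optrefine): combines per-covariate
# imbalances (eps) under the ASSUMED form wMax * max(eps) + wEach * sum(eps).
weighted_objective <- function(eps, wMax = 5, wEach = 1) {
  stopifnot(is.numeric(eps), length(eps) > 0)
  wMax * max(eps) + wEach * sum(eps)
}

# Illustrative imbalance values, one per covariate
eps <- c(0.10, 0.05, 0.20)

# Defaults taken from the Usage section above
obj_default <- weighted_objective(eps)

# Putting all the weight on the worst covariate
obj_max_only <- weighted_objective(eps, wMax = 1, wEach = 0)
```

Under this assumed form, a larger wMax relative to wEach would push the solution toward reducing the worst imbalance rather than the overall imbalance.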

Value

A list containing the following elements:

Examples

# Generate a small data set
set.seed(25)
samp <- sample(1:nrow(rhc_X), 1000)
cov_samp <- sample(1:26, 10)

# Create some strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 5)

# Save the sample sizes
tab <- table(ps$z, ps$base_strata)

# Split the first stratum into two strata with the specified sample sizes
split_stratum(z = ps$z, X = ps$X, strata = ps$base_strata, ist = 1,
              nc = c(floor(tab[1, 1] * 0.25), ceiling(tab[1, 1] * 0.75)),
              nt = c(floor(tab[2, 1] * 0.3), ceiling(tab[2, 1] * 0.7)))
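
Because nc and nt must each sum to the number of controls and treated units in the stratum being split, it can help to build them so that the sums hold by construction. The following is a minimal base-R sketch; split_sizes is a hypothetical helper, not part of optrefine, and the counts are made-up stand-ins for tab[1, 1] and tab[2, 1].

```r
# Hypothetical helper (not part of optrefine): turn target proportions into
# integer sizes that are guaranteed to sum to n.
split_sizes <- function(n, props) {
  stopifnot(length(props) >= 2, all(props > 0), abs(sum(props) - 1) < 1e-8)
  k <- length(props)
  sizes <- floor(n * props[-k])   # floor every share except the last
  c(sizes, n - sum(sizes))        # the last stratum takes the remainder
}

# Illustrative counts standing in for tab[1, 1] and tab[2, 1]
n_control <- 137
n_treated <- 63

nc <- split_sizes(n_control, c(0.25, 0.75))
nt <- split_sizes(n_treated, c(0.3, 0.7))

# Both vectors sum to the stratum totals, as the arguments require
stopifnot(sum(nc) == n_control, sum(nt) == n_treated)
```

The resulting vectors could then be passed as the nc and nt arguments; the same helper extends to splits into three or more strata by supplying a longer props vector.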


[Package optrefine version 1.1.0 Index]