check_balance {natstrat} | R Documentation |
Check covariate balance of the control and treated groups
Description
Reports standardized differences in means between the treated and control groups before and after choosing a subset of controls. These differences are reported both across strata and within strata. This function can also generate love plots of the same quantities.
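As a rough illustration of the quantity being reported, the following base R sketch computes a standardized difference in means for one covariate before and after selecting controls. It assumes the common form (treated mean minus control mean) divided by the standard deviation of all treated units, which corresponds to the idea behind denom_variance = "treated"; the exact computation inside check_balance may differ. The helper std_diff and the toy vectors z, age, and keep are hypothetical and are not part of natstrat.

```r
# Hypothetical toy data (not from nh0506): 1 = treated, 0 = control
z   <- c(1, 1, 1, 0, 0, 0, 0, 0)
age <- c(50, 60, 70, 40, 45, 55, 60, 65)

# Hypothetical selection: all treated units plus three of the five controls
keep <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

# Hypothetical helper: difference in group means among the kept units,
# scaled by the standard deviation of all treated units (assumed form)
std_diff <- function(x, z, keep = rep(TRUE, length(x))) {
  treated_mean <- mean(x[z == 1 & keep])
  control_mean <- mean(x[z == 0 & keep])
  (treated_mean - control_mean) / sd(x[z == 1])
}

before <- std_diff(age, z)        # uses every control
after  <- std_diff(age, z, keep)  # uses only the selected controls

print(c(before = before, after = after))
```

In this toy case the selected controls have the same mean age as the treated units, so the standardized difference shrinks toward zero after selection.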
Usage
check_balance(
z,
X,
st,
selected,
treated = 1,
control = 0,
denom_variance = "treated",
plot = FALSE,
message = TRUE
)
Arguments
z |
a factor with the |
X |
a data frame containing the covariates in the columns over which balance is desired. The number
of rows should equal the length of |
st |
a stratum vector with the |
selected |
a boolean vector indicating whether each unit was selected as part of the treated and control
groups for analysis. Should be the same length as |
treated |
which treatment value should be considered the treated units. This
must be one of the values of |
control |
which treatment value should be considered the control units. This
must be one of the values of |
denom_variance |
character stating what variance to use in the standardization:
either the default "treated", meaning the standardization will use the
treated variance (across all strata), where the treated group is declared in
the |
plot |
a boolean denoting whether to generate love plots for the standardized differences. |
message |
a boolean denoting whether to print a message about the level of balance achieved |
Value
List containing:
- sd_across
matrix with one row per covariate and two columns: one for the standardized difference before a subset of controls were selected and one for after.
- sd_strata
matrix similar to sd_across, but with separate standardized differences for each stratum for each covariate.
- sd_strata_avg
matrix similar to sd_across, but taking the average of the standardized differences within the strata, weighted by stratum size.
- plot_across
ggplot object plotting sd_across, only exists if plot = TRUE.
- plot_strata
a named list of ggplot objects plotting sd_strata, one for each stratum, only exists if plot = TRUE.
- plot_strata_avg
ggplot object plotting sd_strata_avg, only exists if plot = TRUE.
- plot_pair
ggplot object with two facets displaying sd_across and sd_strata_avg with one y-axis, only exists if plot = TRUE.
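To make the size-weighted averaging behind sd_strata_avg concrete, here is a small base R sketch for a single covariate. The per-stratum values and stratum sizes are made up for illustration, and the sketch assumes a plain weighted mean of the signed within-stratum differences; how check_balance defines stratum size and handles signs is not stated here, so treat this as an approximation of the idea only.

```r
# Hypothetical within-stratum standardized differences for one covariate
sd_by_stratum <- c(stratum_a = 0.10, stratum_b = -0.20, stratum_c = 0.40)

# Hypothetical stratum sizes used as weights
n_by_stratum <- c(stratum_a = 50, stratum_b = 30, stratum_c = 20)

# Size-weighted average: sum(sd * n) / sum(n)
sd_avg <- weighted.mean(sd_by_stratum, w = n_by_stratum)

print(sd_avg)
```

Larger strata pull the average toward their own standardized difference, so a small stratum with poor balance has limited influence on the summary.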
Examples
data('nh0506')

# Create strata
age_cat <- cut(nh0506$age,
               breaks = c(19, 39, 50, 85),
               labels = c('< 40 years', '40 - 50 years', '> 50 years'))
strata <- age_cat : nh0506$sex

# Balance age, race, education, poverty ratio, and bmi both across and within the levels of strata
constraints <- generate_constraints(
  balance_formulas = list(age + race + education + povertyr + bmi ~ 1 + strata),
  z = nh0506$z,
  data = nh0506)

# Choose one control for every treated unit in each stratum,
# balancing the covariates as described by the constraints
results <- optimize_controls(z = nh0506$z,
                             X = constraints$X,
                             st = strata,
                             importances = constraints$importances,
                             ratio = 1)

cov_data <- nh0506[, c('sex', 'age', 'race', 'education', 'povertyr', 'bmi')]

# Check balance
stand_diffs <- check_balance(z = nh0506$z,
                             X = cov_data,
                             st = strata,
                             selected = results$selected,
                             plot = TRUE)