hbal {hbal} R Documentation

Hierarchically Regularized Entropy Balancing

Description

hbal performs hierarchically regularized entropy balancing such that the covariate distributions of the control group match those of the treatment group. hbal automatically expands the covariate space to include higher order terms and uses cross-validation to select variable penalties for the balancing conditions.
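For intuition, the core reweighting problem can be sketched in base R. The block below is a minimal illustration of plain (unregularized) entropy balancing on the covariate means, solved through its dual with optim(). It is not the package's implementation: it omits the ridge penalties, the covariate expansion, and the cross-validation, and the names Z, dual, and dual_grad are hypothetical.

```r
# Minimal sketch of unregularized entropy balancing (not hbal's own code).
set.seed(1)
n <- 400
X <- cbind(X1 = rnorm(n), X2 = rbinom(n, 1, 0.5))
treat <- rbinom(n, 1, plogis(0.5 * X[, "X1"]))

X_c <- X[treat == 0, , drop = FALSE]             # control covariates
target <- colMeans(X[treat == 1, , drop = FALSE]) # treated means to match
Z <- sweep(X_c, 2, target)                        # controls centered at the target

# Dual objective: log-sum-exp of the linear index (numerically stabilized).
dual <- function(lambda) {
  eta <- drop(Z %*% lambda)
  m <- max(eta)
  m + log(sum(exp(eta - m)))
}
# Its gradient is the weighted mean of Z, i.e. the remaining imbalance.
dual_grad <- function(lambda) {
  eta <- drop(Z %*% lambda)
  w <- exp(eta - max(eta))
  w <- w / sum(w)
  drop(crossprod(Z, w))
}

fit <- optim(rep(0, ncol(Z)), dual, dual_grad, method = "BFGS",
             control = list(reltol = 1e-12, maxit = 500))

# Recover normalized control weights from the dual solution.
eta <- drop(Z %*% fit$par)
w <- exp(eta - max(eta))
w <- w / sum(w)

# Weighted control means minus treated means; should be close to zero.
balance_gap <- colSums(w * X_c) - target
```

At the optimum the gradient vanishes, which is exactly the condition that the weighted control means equal the treated means.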

Usage

hbal(data, Treat, X, Y = NULL, w = NULL, 
     X.expand = NULL, X.keep = NULL, expand.degree = 1,
     coefs = NULL, max.iterations = 200, cv = NULL, folds = 4,
     ds = FALSE, group.exact = NULL, group.alpha = NULL,
     term.alpha = NULL, constraint.tolerance = 1e-3, print.level = 0,
     grouping = NULL, group.labs = NULL, linear.exact = TRUE, shuffle.treat = TRUE,
     exclude = NULL, force = FALSE, seed = 94035)

Arguments

data

a dataframe that contains the treatment, outcome, and covariates.

Treat

a character string of the treatment variable.

X

a character vector of covariate names to balance on.

Y

a character string of the outcome variable.

w

a character string of the weighting variable for base weights.

X.expand

a character vector of covariate names for series expansion.

X.keep

a character vector of covariate names to keep regardless of whether they are selected in double selection.

expand.degree

degree of series expansion. 1 means no expansion. Default is 1.

coefs

initial coefficients for the reweighting algorithm (lambdas).

max.iterations

maximum number of iterations. Default is 200.

cv

whether to use cross validation. Default is TRUE.

folds

number of folds for cross validation. Only used when cv is TRUE.

ds

whether to perform double selection prior to balancing. Default is FALSE.

group.exact

binary indicator of whether each covariate group should be exact balanced.

group.alpha

penalty for each covariate group.

term.alpha

named vector of ridge penalties, only takes 0 or 1.

constraint.tolerance

tolerance level for overall imbalance. Default is 1e-3.

print.level

details of printed output.

grouping

different groupings of the covariates. Must be specified if expand is FALSE.

group.labs

labels for user-supplied groups.

linear.exact

whether to seek exact balance on the level terms. Default is TRUE.

shuffle.treat

whether to use cross-validation on the treated units. Default is TRUE.

exclude

list of covariate name pairs or triplets to be excluded.

force

binary indicator of whether to expand covariates even when there are too many of them. Default is FALSE.

seed

random seed to be set. Set random seed when cv=TRUE for reproducibility.
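To make the expand.degree argument above more concrete, the following base R sketch builds a degree-2 expansion of two covariates by hand: the level terms, a squared term, and the pairwise interaction. It only illustrates what a series expansion looks like. The exact set of terms hbal generates, and how it names them, may differ, and the toy data frame is hypothetical.

```r
# Hypothetical toy data: one continuous and one binary covariate.
dat <- data.frame(X1 = c(-1, 0, 1, 2), X2 = c(0, 1, 1, 0))

# Degree-2 expansion written out by hand: level terms, the square of the
# continuous covariate, and the pairwise interaction. The square of the
# binary X2 is omitted because it would duplicate X2 itself.
expanded <- model.matrix(~ X1 + X2 + I(X1^2) + X1:X2 - 1, data = dat)

colnames(expanded)
```

An interaction such as X1:X2 is the kind of pair the exclude argument refers to.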

Details

In the simplest set-up, the user can simply pass in {Treat, X, Y} together with data. The default settings will serially expand X to include higher order terms, hierarchically residualize these terms, perform double selection to keep only the relevant variables, and use cross-validation to select penalties for different groupings of the covariates.
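The idea of hierarchical residualization can be sketched in base R as follows: each higher-order term is regressed on the lower-order (level) terms and only the residual is kept, so that the higher-order terms carry no information already contained in the levels. This is an illustration under that assumption, not the package's internal procedure, and the object names are hypothetical.

```r
set.seed(2)
n <- 200
X1 <- rnorm(n)
X2 <- rnorm(n)

level  <- cbind(X1, X2)                                   # first-order terms
second <- cbind(X1sq = X1^2, X2sq = X2^2, X1X2 = X1 * X2) # second-order terms

# lm() accepts a matrix response and fits one regression per column;
# residuals() then returns an n x 3 matrix of residualized terms.
second_resid <- residuals(lm(second ~ level))

# By construction the residualized terms are uncorrelated with the levels.
max_cor <- max(abs(cor(second_resid, level)))
```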

Value

A list object of class hbal with the following elements:

coefs

vector that contains coefficients from the reweighting algorithm.

mat

matrix of serially expanded covariates if expand=TRUE. Otherwise, the original covariate matrix is returned.

penalty

vector of ridge penalties used for each covariate.

weights

vector that contains the control group weights assigned by hbal.

W

vector of treatment status.

Y

vector of outcome.
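As a rough illustration of how control-group weights of this kind can be used, the sketch below computes a simple difference-in-means estimate of the ATT: the treated mean minus the weighted control mean. The helper att_from_weights is hypothetical and is not part of the package. It assumes the weight vector is ordered like the control rows of the data, and it does not reproduce the package's att() function.

```r
# Hypothetical helper: treated mean minus weighted control mean.
att_from_weights <- function(y, treat, w_ctrl) {
  stopifnot(length(w_ctrl) == sum(treat == 0))
  mean(y[treat == 1]) - weighted.mean(y[treat == 0], w_ctrl)
}

y      <- c(3, 5, 1, 2, 4)
treat  <- c(1, 1, 0, 0, 0)
w_ctrl <- c(0.2, 0.3, 0.5)  # toy weights for the three control units

est <- att_from_weights(y, treat, w_ctrl)
est  # 4 - (0.2*1 + 0.3*2 + 0.5*4) = 1.2
```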

Author(s)

Yiqing Xu, Eddie Yang

Yiqing Xu <yiqingxu@stanford.edu>, Eddie Yang <z5yang@ucsd.edu>

References

Xu, Y., & Yang, E. (2022). Hierarchically Regularized Entropy Balancing. Political Analysis, 1-8. doi:10.1017/pan.2022.12

Examples

# Example 1
library(hbal)
set.seed(1984)
N <- 500
X1 <- rnorm(N)
X2 <- rbinom(N,size=1,prob=.5)
X <- cbind(X1, X2)
treat <- rbinom(N, 1, prob=0.5) # Treatment indicator
y <- 0.5 * treat + X[,1] + X[,2] + rnorm(N) # Outcome
dat <- data.frame(treat=treat, X, Y=y)
out <- hbal(Treat = 'treat', X = c('X1', 'X2'), Y = 'Y', data=dat)
summary(hbal::att(out))

# Example 2
## Simulation from Kang and Schafer (2007).
library(MASS)
set.seed(1984)
n <- 500
X <- mvrnorm(n, mu = rep(0, 4), Sigma = diag(4))
prop <- 1 / (1 + exp(X[,1] - 0.5 * X[,2] + 0.25*X[,3] + 0.1 * X[,4]))
# Treatment indicator
treat <- rbinom(n, 1, prop)
# Outcome
y <- 210 + 27.4*X[,1] + 13.7*X[,2] + 13.7*X[,3] + 13.7*X[,4] + rnorm(n)
# Observed covariates
X.mis <- cbind(exp(X[,1]/2), X[,2]*(1+exp(X[,1]))^(-1)+10, 
    (X[,1]*X[,3]/25+.6)^3, (X[,2]+X[,4]+20)^2)
dat <- data.frame(treat=treat, X.mis, Y=y)
out <- hbal(Treat = 'treat', X = c('X1', 'X2', 'X3', 'X4'), Y='Y', data=dat)
summary(att(out))

[Package hbal version 1.2.12 Index]