bal {mvGPS}    R Documentation

Construct Covariate Balance Statistics for Models with Multivariate Exposure

Description

Assessing balance between exposure(s) and confounders is key when performing causal analysis using propensity scores. This function generates weights from a list of candidate models for causal inference with multivariate exposures, and tests the balancing property of those weights using weighted Pearson correlations. It also returns the effective sample size for each method.

Usage

bal(
  model_list,
  D,
  C,
  common = FALSE,
  trim_w = FALSE,
  trim_quantile = 0.99,
  all_uni = TRUE,
  ...
)

Arguments

model_list

character vector identifying which methods to use when constructing weights. See Details for the list of available models

D

numeric matrix of dimension n by m containing the values of the m exposures

C

either a list of length m of numeric matrices, where the j-th matrix has dimension n by p_j and contains the confounders for the j-th exposure, or, if common is TRUE, a single n by p matrix of confounders common to all exposures.

common

logical indicator for whether C is a single matrix of confounders common to all exposures. Default is FALSE, meaning C must be specified as a list of confounder matrices of length m.

trim_w

logical indicator for whether to trim weights. Default is FALSE

trim_quantile

numeric scalar specifying the upper quantile at which to trim weights, if applicable. Default is 0.99

all_uni

logical indicator. If TRUE, all univariate models specified in model_list are estimated for each exposure. If FALSE, weights are estimated only for the first exposure

...

additional arguments to pass to the weightit function when specifying one of its models in model_list

Details

When using propensity score methods for causal inference it is crucial to check the balancing property of the covariates and exposure(s). To do this in the multivariate case, we first generate weights using one of the methods listed below.

Methods Available

"mvGPS": multivariate generalized propensity score

"entropy": entropy balancing for continuous exposures (Tübbicke 2020)

"CBPS": covariate balancing propensity score (Fong et al. 2018)

"PS": generalized propensity score

"GBM": generalized propensity score estimated with gradient boosted machines (Zhu et al. 2015)

Note that only the mvGPS method is multivariate; all others are strictly univariate. When all_uni=TRUE, univariate methods estimate weights for each exposure separately using the weightit function with that exposure's confounders from C. To estimate weights for only the first exposure, set all_uni=FALSE.

The weights for each method can also be trimmed at a desired quantile by setting trim_w=TRUE and choosing trim_quantile in [0.5, 1]. Trimming is applied at both the upper and lower bounds, i.e., at trim_quantile and 1 - trim_quantile. For further details on how trimming is performed see mvGPS.
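Conceptually, two-sided trimming winsorizes the weights at the chosen quantile bounds. A minimal sketch of the idea (an illustration only, not the package's internal implementation; the function name trim_weights is hypothetical):

```r
# Winsorize weights at the trim_quantile upper bound and the matching
# lower bound (illustrative sketch, not a mvGPS package function)
trim_weights <- function(w, q = 0.99) {
  lo <- quantile(w, 1 - q)
  hi <- quantile(w, q)
  pmin(pmax(w, lo), hi)  # values outside [lo, hi] are set to the bounds
}
```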

Balance Metrics

In this package we include three key balancing metrics to summarize balance across all of the exposures.

Euclidean distance is calculated using the origin as the reference point, e.g., for m=2 exposures the reference point is [0, 0]. In this way we calculate how far the observed set of correlation points is from perfect balance.

Maximum absolute correlation reports the largest single imbalance between the exposures and the set of confounders. It is often a key diagnostic as even a single confounder that is sufficiently out of balance can reduce performance.

Average absolute correlation is the mean of the absolute exposure-confounder correlations. This metric summarizes how well, on average, the entire set of exposures is balanced.
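Given a vector of weighted exposure-confounder Pearson correlations, the three summaries are straightforward to compute. A sketch, assuming a numeric vector r of such correlations (all function names here are illustrative, not part of the package API):

```r
# Three balance summaries from a vector `r` of weighted Pearson
# correlations between exposures and confounders (illustrative sketch)
euclidean_dist <- function(r) sqrt(sum(r^2))  # distance from the origin, i.e. perfect balance
max_abs_cor    <- function(r) max(abs(r))     # largest single imbalance
avg_abs_cor    <- function(r) mean(abs(r))    # average imbalance across all pairs
```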

Effective Sample Size

Effective sample size, ESS, is defined as

ESS = (Σ_i w_i)^2 / Σ_i w_i^2,

where w_i are the estimated weights for a particular method (Kish 1965). Note that when w_i = 1 for all units, the ESS is equal to the sample size n. ESS decreases when there are extreme weights or high variability in the weights.
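The definition above translates directly into code; a sketch, where the helper name ess is illustrative rather than an exported package function:

```r
# Effective sample size (Kish 1965), computed from a weight vector
ess <- function(w) sum(w)^2 / sum(w^2)
ess(rep(1, 150))  # equal weights recover the sample size: 150
```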

Value

A list containing the following components (as used in the Examples below):

W: a list of the estimated weights for each method in model_list

bal_metrics: summary balance statistics for each method

ess: the effective sample size for each set of weights

References

Fong C, Hazlett C, Imai K (2018). “Covariate balancing propensity score for a continuous treatment: application to the efficacy of political advertisements.” Annals of Applied Statistics, In-Press.

Kish L (1965). Survey Sampling. John Wiley \& Sons, New York.

Tübbicke S (2020). “Entropy Balancing for Continuous Treatments.” arXiv e-prints. 2001.06281.

Zhu Y, Coffman DL, Ghosh D (2015). “A boosting algorithm for estimating generalized propensity scores with continuous treatments.” Journal of Causal Inference, 3(1), 25-40.

Examples

#simulating data
sim_dt <- gen_D(method="u", n=150, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2,
k=3, C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#generating weights using mvGPS and potential univariate alternatives
require(WeightIt)
bal_sim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
C=list(C[, 1:2], C[, 2:3]))

#overall summary statistics
bal_sim$bal_metrics

#effective sample sizes
bal_sim$ess

#we can also trim weights for all methods
bal_sim_trim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
C=list(C[, 1:2], C[, 2:3]), trim_w=TRUE, trim_quantile=0.9, p.mean=0.5)
#note that in this case we can also pass additional arguments to the
#WeightIt package for entropy, CBPS, PS, and GBM, such as specifying p.mean

#can check to ensure all the weights have been properly trimmed at upper and
#lower bound
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 0.9))),
unname(unlist(lapply(bal_sim_trim$W, max))))
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 1-0.9))),
unname(unlist(lapply(bal_sim_trim$W, min))))


[Package mvGPS version 1.2.2 Index]