npCBPS {CBPS}R Documentation

Non-Parametric Covariate Balancing Propensity Score (npCBPS) Estimation

Description

npCBPS is a method to estimate weights interpretable as (stabilized) inverse generlized propensity score weights, w_i = f(T_i)/f(T_i|X), without actually estimating a model for the treatment to arrive at f(T|X) estimates. In brief, this works by maximizing the empirical likelihood of observing the values of treatment and covariates that were observed, while constraining the weights to be those that (a) ensure balance on the covariates, and (b) maintain the original means of the treatment and covariates.

In the continuous treatment context, this balance on covariates means zero correlation of each covariate with the treatment. In binary or categorical treatment contexts, balance on covariates implies equal means on the covariates for observations at each level of the treatment. When given a numeric treatment the software handles it continuously. To handle the treatment as binary or categorical is must be given as a factor.

Furthermore, we apply a Bayesian variant that allows the correlation of each covariate with the treatment to be slightly non-zero, as might be expected in a a given finite sample.

Estimates non-parametric covariate balancing propensity score weights.

### @aliases npCBPS npCBPS.fit

Usage

npCBPS(formula, data, na.action, corprior = 0.01, print.level = 0, ...)

Arguments

formula

An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

An optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which CBPS is called.

na.action

A function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset.

corprior

Prior hyperparameter controlling the expected amount of correlation between each covariate and the treatment. Specifically, the amount of correlation between the k-dimensional covariates, X, and the treatment T after weighting is assumed to have prior distribution MVN(0,sigma^2 I_k). We conceptualize sigma^2 as a tuning parameter to be used pragmatically. It's default of 0.1 ensures that the balance constraints are not too harsh, and that a solution is likely to exist. Once the algorithm works at such a high value of sigma^2, the user may wish to attempt values closer to 0 to get finer balance.

print.level

Controls verbosity of output to the screen while npCBPS runs. At the default of print.level=0, little output is produced. It print.level>0, it outputs diagnostics including the log posterior (log_post), the log empirical likelihood associated with the weights (log_el), and the log prior probability of the (weighted) correlation of treatment with the covariates.

...

Other parameters to be passed.

Value

weights

The optimal weights

y

The treatment vector used

x

The covariate matrix

model

The model frame

call

The matched call

formula

The formula supplied

data

The data argument

log.p.eta

The log density for the (weighted) correlation of the covariates with the treatment, given the choice of prior (corprior)

log.el

The log empirical likelihood of the observed data at the chosen set of IPW weights.

eta

A vector describing the correlation between the treatment and each covariate on the weighted data at the solution.

sumw0

The sum of weights, provided as a check on convergence. This is always 1 when convergence occurs unproblematically. If it differs from 1 substantially, no solution perfectly satisfying the conditions was found, and the user may consider a larger value of corprior.

Author(s)

Christian Fong, Chad Hazlett, and Kosuke Imai

References

Fong, Christian, Chad Hazlett, and Kosuke Imai. “Parametric and Nonparametric Covariate Balancing Propensity Score for General Treatment Regimes.” Unpublished Manuscript. http://imai.princeton.edu/research/files/CBGPS.pdf

Examples


##Generate data
data(LaLonde)

## Restricted two only two covariates so that it will run quickly.
## Performance will remain good if the full LaLonde specification is used
fit <- npCBPS(treat ~ age + educ, data = LaLonde, corprior=.1/nrow(LaLonde))
plot(fit)


[Package CBPS version 0.23 Index]