bartc {bartCause}  R Documentation 
Causal Inference using Bayesian Additive Regression Trees
Description
Fits a collection of treatment and response models using the Bayesian Additive Regression Trees (BART) algorithm, producing estimates of treatment effects.
Usage
bartc(response, treatment, confounders, parametric, data, subset, weights,
method.rsp = c("bart", "tmle", "p.weight"),
method.trt = c("bart", "glm", "none"),
estimand = c("ate", "att", "atc"),
group.by = NULL,
commonSup.rule = c("none", "sd", "chisq"),
commonSup.cut = c(NA_real_, 1, 0.05),
args.rsp = list(), args.trt = list(),
p.scoreAsCovariate = TRUE, use.ranef = TRUE, group.effects = FALSE,
crossvalidate = FALSE,
keepCall = TRUE, verbose = TRUE,
seed = NA_integer_,
...)
Arguments
response 
A vector of the outcome variable, or a reference to such in the 
treatment 
A vector of the binary treatment variable, or a reference to 
confounders 
A matrix or data frame of covariates to be used in estimating the treatment
and response models. Can also be the right-hand side of a formula (e.g.

parametric 
The right-hand side of a formula (e.g. 
data 
An optional data frame or named list containing the 
subset 
An optional vector used to subset the data. Can refer to 
weights 
An optional vector of population weights used in model fitting and
estimating the treatment effect. Can refer to 
method.rsp 
A character string specifying which method to use when fitting the response
surface and estimating the treatment effect. Options are: 
method.trt 
A character string specifying which method to use when fitting the treatment
assignment mechanism, or a vector/matrix of propensity scores. Character
string options are: 
estimand 
A character string specifying which causal effect to target. Options are

group.by 
An optional factor that, when present, causes the treatment effect estimate to be calculated within each group. 
commonSup.rule 
Rule for exclusion of observations lacking in common support. Options are

commonSup.cut 
Cutoffs for 
p.scoreAsCovariate 
A logical such that when 
use.ranef 
Logical specifying if 
group.effects 
Logical specifying if effects should be calculated within groups if the

keepCall 
A logical such that when 
crossvalidate 
One of 
verbose 
A logical that when 
seed 
Optional integer specifying the desired pRNG seed. It should not be needed
when running single-threaded  
args.rsp, args.trt, ... 
Further arguments to the treatment and response model fitting algorithms.
Arguments passed to the main function as ... will be used in both models.

Details
bartc represents a collection of methods that primarily use the
Bayesian Additive Regression Trees (BART) algorithm to estimate causal
treatment effects with binary treatment variables and continuous or binary
outcomes. This requires models to be fit to the response surface (distribution
of the response as a function of treatment and confounders,
p(Y(1), Y(0) | X)) and optionally for the treatment assignment mechanism
(probability of receiving treatment, i.e. propensity score,
Pr(Z = 1 | X)). The response surface model is used to impute
counterfactuals, which may then be adjusted together with the propensity score
to produce estimates of effects.
Similar to lm, models can be specified symbolically. When the data term is
present, it will be added to the search path for the response, treatment, and
confounders variables. The confounders must be specified devoid of any
"left hand side", as they appear in both of the models.
Response Surface
The response surface methods included are:

"bart"
 use BART to fit the response surface and produce individual estimates \hat{Y}(1)_i and \hat{Y}(0)_i. Treatment effect estimates are obtained by averaging the difference of these across the population of interest. 
"p.weight"
 individual effects are estimated as in "bart", but treatment effect estimates are obtained by using a propensity score weighted average. For the average treatment effect on the treated, these weights are p(z_i | x_i) / (\sum z / n). For ATC, replace z with 1 - z. For ATE, "p.weight" is equal to "bart". 
"tmle"
 individual effects are estimated as in "bart" and a weighted average is taken as in "p.weight", however the response surface estimates and propensity scores are corrected by using the Targeted Minimum Loss based Estimation method.
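The difference between the "bart" and "p.weight" averaging steps can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: effect_estimate is a hypothetical helper, icate stands for the individual effect estimates \hat{Y}(1)_i - \hat{Y}(0)_i, p(z_i | x_i) is read as the estimated probability of treatment, and the weights are assumed to enter as a simple weighted mean over all observations.

```r
# Hypothetical helper, not part of bartCause: average individual effect
# estimates, optionally using propensity score weights.
effect_estimate <- function(icate, z, p.score = NULL,
                            estimand = c("ate", "att", "atc")) {
  estimand <- match.arg(estimand)
  if (is.null(p.score)) {
    # "bart"-style: plain average over the population of interest
    keep <- switch(estimand,
                   ate = rep(TRUE, length(z)),
                   att = z == 1,
                   atc = z == 0)
    return(mean(icate[keep]))
  }
  # "p.weight"-style: weighted average over all observations
  # (assumed form of the weights described above)
  w <- switch(estimand,
              ate = rep(1, length(z)),
              att = p.score / mean(z),
              atc = (1 - p.score) / mean(1 - z))
  mean(w * icate)
}

# toy inputs, made up for illustration
icate   <- c(1, 2, 3, 4)
z       <- c(1, 1, 0, 0)
p.score <- c(0.8, 0.6, 0.4, 0.2)

att.plain    <- effect_estimate(icate, z, estimand = "att")
att.weighted <- effect_estimate(icate, z, p.score, estimand = "att")
ate.plain    <- effect_estimate(icate, z, estimand = "ate")
ate.weighted <- effect_estimate(icate, z, p.score, estimand = "ate")
```

With unit weights the ATE is the same under both approaches, which matches the statement that "p.weight" equals "bart" for that estimand.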
Treatment Assignment
The treatment assignment models are:

"bart"
 fit a binary BART directly to the treatment using all the confounders. 
"none"
 no modeling is done. Only applies when using response method "bart" and p.scoreAsCovariate is FALSE. 
"glm"
 fit a binomial generalized linear model with logistic link and confounders included as linear terms. 

Finally, a vector or matrix of propensity scores can be supplied. Propensity score matrices should have a number of rows equal to the number of observations in the data and a number of columns equal to the number of posterior samples.
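A minimal base R sketch of what user-supplied propensity scores might look like, assuming simulated data. The logistic regression mirrors the description of the "glm" option but is fit here by hand; the object names (trt.fit, p.hat, p.mat) and the simulated coefficients are made up for illustration.

```r
set.seed(1)
n <- 200L
x <- matrix(rnorm(2 * n), n, 2)
z <- rbinom(n, 1, plogis(0.5 * x[, 1] - 0.25 * x[, 2]))

# logistic regression with the confounders entering as linear terms
trt.fit <- glm(z ~ x, family = binomial(link = "logit"))
p.hat <- unname(fitted(trt.fit))  # vector form: one score per observation

# matrix form: one row per observation, one column per posterior sample;
# the point estimate is simply repeated here, which ignores any
# uncertainty in the propensity scores
n.samples <- 50L
p.mat <- matrix(p.hat, nrow = n, ncol = n.samples)
```

Either object has the shape described above for a vector or matrix passed as method.trt.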
Parametrics
bartc uses the stan4bart package, when available, to fit semi-parametric
surfaces. Equations can be specified as for lm. Grouping structures are also
permitted, and use the syntax of lmer.
Generics
For a fitted model, the easiest way to analyze the resulting fit is to use the
generics fitted, extract, and predict to analyze specific quantities and
summary to aggregate those values into targets (e.g. ATE, ATT, or ATC).
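A short sketch of how these generics might be used on a small simulated fit. It assumes the bartCause package is installed, and that "icate" is an accepted type for fitted and extract and "cate" an accepted target for summary; the simulated data and sampler settings are arbitrary and far too small for real use.

```r
if (requireNamespace("bartCause", quietly = TRUE)) {
  library(bartCause)

  set.seed(1)
  n <- 100L
  x <- matrix(rnorm(3 * n), n, 3)
  z <- rbinom(n, 1, pnorm(x[, 1]))
  y <- as.vector(x %*% c(0.5, 1, 1.5)) + 2 * z + rnorm(n)

  # low sampler settings only to keep the sketch fast
  fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L,
               verbose = FALSE)

  icate.mean <- fitted(fit, type = "icate")   # posterior means per observation
  icate.samp <- extract(fit, type = "icate")  # posterior samples
  print(summary(fit, target = "cate"))        # aggregate into a target
}
```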
Common Support Rules
Common support, or that the probability of receiving all treatment conditions
is nonzero within every area of the covariate space
(P(Z = 1 | X = x) > 0 for all x in the inferential sample), can be
enforced by excluding observations with high posterior uncertainty.
bartc supports two common support rules through the commonSup.rule
argument:

"sd"
 observations are cut from the inferential sample if: s_i^{f(1-z)} > m_z + a \times sd(s_j^{f(z)}), where s_i^{f(1-z)} is the posterior standard deviation of the predicted counterfactual for observation i, s_j^{f(z)} is the posterior standard deviation of the prediction for the observed treatment condition of observation j, sd(s_j^{f(z)}) is the empirical standard deviation of those quantities, and m_z = max_j \{s_j^{f(z)}\} for all j in the same treatment group, i.e. Z_j = z. a is a constant to be passed in using commonSup.cut and its default is 1. 
"chisq"
 observations are cut from the inferential sample if: (s_i^{f(1-z)} / s_i^{f(z)})^2 > q_\alpha, where s_i are as above and q_\alpha is the upper \alpha percentile of a \chi^2 distribution with one degree of freedom, corresponding to a null hypothesis of equal variance. The default for \alpha is 0.05, and it is specified using the commonSup.cut parameter.
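The two rules can be sketched in base R from vectors of posterior standard deviations. This is an illustration of the inequalities as stated above, not the package's code: common_support_keep is a hypothetical helper, sd.obs and sd.cf are assumed to hold s_i^{f(z)} and s_i^{f(1-z)} for each observation, and the toy numbers are made up.

```r
# Hypothetical helper, not part of bartCause: returns TRUE for observations
# that would be kept under the stated rule.
common_support_keep <- function(sd.obs, sd.cf, z,
                                rule = c("sd", "chisq"), cut = NULL) {
  rule <- match.arg(rule)
  if (rule == "sd") {
    a <- if (is.null(cut)) 1 else cut
    keep <- logical(length(z))
    for (grp in unique(z)) {
      in.grp <- z == grp
      # m_z + a * sd(s_j^{f(z)}), using observed-condition sds in the group
      threshold <- max(sd.obs[in.grp]) + a * sd(sd.obs[in.grp])
      keep[in.grp] <- sd.cf[in.grp] <= threshold
    }
    keep
  } else {
    alpha <- if (is.null(cut)) 0.05 else cut
    # upper alpha quantile of a chi-squared with one degree of freedom
    (sd.cf / sd.obs)^2 <= qchisq(1 - alpha, df = 1)
  }
}

z      <- c(1, 1, 1, 0, 0, 0)
sd.obs <- c(1.0, 1.2, 1.1, 0.9, 1.0, 1.1)
sd.cf  <- c(1.1, 5.0, 1.0, 1.0, 0.9, 1.15)

keep.sd    <- common_support_keep(sd.obs, sd.cf, z, rule = "sd")
keep.chisq <- common_support_keep(sd.obs, sd.cf, z, rule = "chisq")
```

In this toy input only the second observation, whose counterfactual standard deviation is far larger than anything seen in its own treatment group, is dropped under either rule.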
Special Arguments
Some default arguments are unconventional or are passed in a unique fashion.
If n.chains is missing, unlike in bart2 a default of 10 is used.
For method.rsp == "tmle", a special args.trt of posteriorOfTMLE determines if the TMLE correction should be applied to each posterior sample (TRUE), or just the posterior mean (FALSE).
Missing Data
Missingness is allowed only in the response. If some response values are
NA, the BART models will be trained just for where data are available
and those values will be used to make predictions for the missing
observations. Missing observations are not used when calculating statistics
for assessing common support, although they may still be excluded on those
grounds. Further, missing observations may not be compatible with response
method "tmle".
Value
bartc returns an object of class bartcFit. Information about the
object can be derived by using methods summary, plot_sigma, plot_est,
plot_indiv, and plot_support. Numerical quantities are recovered with the
fitted and extract generics. Predictions for new observations are obtained
with predict.
Objects of class bartcFit are lists containing items:
method.rsp 
character string specifying the method used to fit the response surface 
method.trt 
character string specifying the method used to fit the treatment assignment mechanism 
estimand 
character string specifying the targeted causal effect 
fit.rsp 
object containing the fitted response model 
data.rsp 

fit.trt 
object containing the fitted treatment model 
group.by 
optional factor vector containing the groups in which treatment effects are estimated 
est 
matrix or array of posterior samples of the treatment effect estimate 
p.score 
the vector of propensity scores used as a covariate in the response model, when applicable 
samples.p.score 
matrix or array of posterior samples of the propensity score, when applicable 
mu.hat.obs 
samples from the posterior of the expected value for individual responses under the observed treatment regime 
mu.hat.cf 
samples from the posterior of the expected value for individual responses under the counterfactual treatment 
name.trt 
character string giving the name of the treatment
variable in the data of 
trt 
vector of treatment assignments 
call 
how 
n.chains 
number of independent posterior sampler chains in response model 
commonSup.rule 
common support rule used for suppressing observations 
commonSup.cut 
common support parameter used to set cutoff when suppressing observations 
sd.obs 
vector of standard deviations of individual posterior predictors for observed treatment conditions 
sd.cf 
vector of standard deviations of individual posterior predictors for counterfactuals 
commonSup.sub 
logical vector expressing which observations are used when estimating treatment effects 
use.ranef 
logical for whether ranef models were used; only added when true 
group.effects 
logical for whether group-level estimates are made; only added when true 
seed 
a random seed for use when drawing from the posterior predictive distribution 
Author(s)
Vincent Dorie: vdorie@gmail.com.
References
Chipman, H., George, E. and McCulloch, R. (2010) BART: Bayesian additive regression trees. The Annals of Applied Statistics 4(1), 266–298. The Institute of Mathematical Statistics. doi:10.1214/09-AOAS285.
Hill, J. L. (2011) Bayesian Nonparametric Modeling for Causal Inference. Journal of Computational and Graphical Statistics 20(1), 217–240. Taylor & Francis. doi:10.1198/jcgs.2010.08162.
Hill, J. L. and Su, Y. S. (2013) Assessing Lack of Common Support in Causal Inference Using Bayesian Nonparametrics: Implications for Evaluating the Effect of Breastfeeding on Children's Cognitive Outcomes. The Annals of Applied Statistics 7(3), 1386–1420. The Institute of Mathematical Statistics. doi:10.1214/13-AOAS630.
Carnegie, N. B. (2019) Comment: Contributions of Model Features to BART Causal Inference Performance Using ACIC 2016 Competition Data. Statistical Science 34(1), 90–93. The Institute of Mathematical Statistics. doi:10.1214/18-STS682.
Hahn, P. R., Murray, J. S., and Carvalho, C. M. (2020) Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects (with Discussion). Bayesian Analysis 15(3), 965–1056. International Society for Bayesian Analysis. doi:10.1214/19-BA1195.
See Also
Examples
## fit a simple linear model
n <- 100L
beta.z <- c(.75, 0.5, 0.25)
beta.y <- c(.5, 1.0, 1.5)
sigma <- 2
set.seed(725)
x <- matrix(rnorm(3 * n), n, 3)
tau <- rgamma(1L, 0.25 * 16 * rgamma(1L, 1 * 32, 32), 16)
p.score <- pnorm(x %*% beta.z)
z <- rbinom(n, 1, p.score)
mu.0 <- x %*% beta.y
mu.1 <- x %*% beta.y + tau
y <- mu.0 * (1 - z) + mu.1 * z + rnorm(n, 0, sigma)
# low parameters only for example
fit <- bartc(y, z, x, n.samples = 100L, n.burn = 15L, n.chains = 2L)
summary(fit)
## example to show refitting under the common support rule
fit2 <- refit(fit, commonSup.rule = "sd")
fit3 <- bartc(y, z, x, subset = fit2$commonSup.sub,
              n.samples = 100L, n.burn = 15L, n.chains = 2L)