GBCEE {BCEE} | R Documentation |
Generalized BCEE algorithm
Description
A generalized double robust Bayesian model averaging approach to causal effect estimation. This function accommodates both binary and continuous exposures and outcomes. More details are available in Talbot and Beaudoin (2022).
Usage
GBCEE(X, Y, U, omega, niter = 5000, family.X = "gaussian",
family.Y = "gaussian", X1 = 1, X0 = 0, priorX = NA, priorY = NA, maxsize = NA,
OR = 20, truncation = c(0.01, 0.99), var.comp = "asymptotic", B = 200, nsampX = 30)
Arguments
X |
A vector of observed values for the exposure. |
Y |
A vector of observed values for the outcome. |
U |
A matrix of observed values for the |
omega |
The value of the hyperparameter omega in the BCEE's outcome model prior distribution. A recommended implementation is to take |
niter |
The number of iterations in the Markov chain Monte Carlo model composition (MC^3) algorithm (Madigan et al. 1995). The default is 5000, but larger values are recommended when the number of potential confounding covariates is large. |
family.X |
Distribution to be used for the exposure model. This should be |
family.Y |
Distribution to be used for the outcome model. This should be |
X1 |
The value of |
X0 |
The value of |
priorX |
A vector of length |
priorY |
A vector of length |
maxsize |
The maximum number of covariates that can be included in a given exposure or outcome model. The default is |
OR |
A number specifying the maximum ratio for excluding models in Occam's window for the outcome modeling step. All outcome models whose posterior probability is more than |
truncation |
A vector of length 2 indicating the smallest and largest values for the estimated propensity score ( |
var.comp |
The method for computing the variance of the targeted maximum likelihood estimators in the BCEE algorithm. The possible values are |
B |
The number of bootstrap samples when estimating the variance using the nonparametric bootstrap. The default is 200. |
nsampX |
The number of samples to take from the exposure distribution for the Monte Carlo integration when X is continuous and Y is binary. The default is 30. |
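As an illustration of what the truncation argument describes, the sketch below bounds a vector of estimated propensity scores to the default interval c(0.01, 0.99). It uses base R only. The helper name truncate_ps and the simulated data are hypothetical, and the way GBCEE applies the bounds internally may differ.

```r
# Hypothetical helper: bound estimated propensity scores to [lower, upper].
truncate_ps <- function(ps, truncation = c(0.01, 0.99)) {
  pmin(pmax(ps, truncation[1]), truncation[2])
}

set.seed(1)
n <- 500
U <- rnorm(n)
# A strong covariate effect on a binary exposure produces extreme scores
X <- rbinom(n, size = 1, prob = plogis(2 * U))
ps <- fitted(glm(X ~ U, family = binomial))  # estimated P(X = 1 | U)

ps_trunc <- truncate_ps(ps)
range(ps)        # may extend beyond the bounds
range(ps_trunc)  # always within [0.01, 0.99]
```

Scores already inside the interval are left unchanged; only the extremes are moved to the nearest bound.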
Details
When both Y and X are continuous, GBCEE estimates \Delta = E[Y^{x+1}] - E[Y^x], assuming a linear effect of X on Y. When Y is continuous and X is binary, GBCEE estimates \Delta = E[Y^{X1}] - E[Y^{X0}]. When Y and X are both binary, GBCEE estimates both the difference \Delta = E[Y^{X1}] - E[Y^{X0}] and the ratio \Delta = E[Y^{X1}]/E[Y^{X0}]. When Y is binary and X is continuous, GBCEE estimates the slope \beta_1 of the logistic marginal structural working model logit(E[Y^{x}]) = \beta_0 + \beta_1 x.
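To make the two contrasts for a binary exposure and a binary outcome concrete, the sketch below computes E[Y^{X1}] - E[Y^{X0}] and E[Y^{X1}]/E[Y^{X0}] with X1 = 1 and X0 = 0. It is an illustration only: it uses plain standardization (g-computation) with a single logistic outcome model in base R, not the model-averaged targeted estimator of GBCEE, and the simulated data and variable names are hypothetical.

```r
# Illustration only: standardization with one outcome model,
# not the estimator implemented in GBCEE.
set.seed(2)
n  <- 2000
U1 <- rnorm(n)
X  <- rbinom(n, 1, plogis(0.5 * U1))
Y  <- rbinom(n, 1, plogis(-0.5 + 1 * X + 0.5 * U1))
dat <- data.frame(Y, X, U1)

fit <- glm(Y ~ X + U1, family = binomial, data = dat)

# Counterfactual means: predict for everyone under X = 1 and under X = 0
EY1 <- mean(predict(fit, newdata = transform(dat, X = 1), type = "response"))
EY0 <- mean(predict(fit, newdata = transform(dat, X = 0), type = "response"))

contrast_difference <- EY1 - EY0  # E[Y^{X1}] - E[Y^{X0}]
contrast_ratio      <- EY1 / EY0  # E[Y^{X1}] / E[Y^{X0}]
```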
The GBCEE function first computes the exposure model's posterior distribution using a Markov chain Monte Carlo model composition (MC^3) algorithm (Madigan et al. 1995). The outcome model's posterior distribution is then computed, also using MC^3 (Madigan et al. 1995), as described in Section 3.4 of Talbot and Beaudoin (2022).
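The general idea of an MC^3 search over covariate subsets can be sketched in a few lines of base R. The code below is a minimal illustration and not the BCEE implementation: it assumes a linear model, a uniform prior over models, and exp(-BIC/2) as an approximation to the marginal likelihood, none of which is taken from the package. The function name mc3_sketch and the simulated data are hypothetical.

```r
# Minimal MC^3 sketch for covariate selection in a linear model.
# Assumptions of this sketch: uniform model prior, BIC approximation.
mc3_sketch <- function(y, U, niter = 1000) {
  n <- length(y)
  p <- ncol(U)
  # BIC up to an additive constant that is common to all models
  bic_of <- function(incl) {
    design <- cbind(rep(1, n), U[, incl, drop = FALSE])
    rss <- sum(lm.fit(design, y)$residuals^2)
    n * log(rss / n) + ncol(design) * log(n)
  }
  current <- rep(FALSE, p)             # start from the empty model
  bic_cur <- bic_of(current)
  visits  <- matrix(FALSE, niter, p)
  for (i in seq_len(niter)) {
    proposal <- current
    j <- sample.int(p, 1)              # neighbour: add or drop one covariate
    proposal[j] <- !proposal[j]
    bic_prop <- bic_of(proposal)
    # Metropolis acceptance step (the proposal is symmetric)
    if (log(runif(1)) < -(bic_prop - bic_cur) / 2) {
      current <- proposal
      bic_cur <- bic_prop
    }
    visits[i, ] <- current
  }
  colMeans(visits)                     # share of iterations including each covariate
}

set.seed(3)
n <- 200
U <- cbind(U1 = rnorm(n), U2 = rnorm(n), U3 = rnorm(n))
y <- 1 * U[, "U1"] + rnorm(n)          # only U1 is related to y
incl_prob <- mc3_sketch(y, U, niter = 500)
```

The chain moves between neighbouring models that differ by one covariate, so the share of iterations spent in models containing a covariate approximates its posterior inclusion probability under the stated assumptions.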
GBCEE assumes there are no missing values in the objects X, Y and U. The na.omit function, which removes cases with missing data, or an imputation package might be helpful.
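One way to prepare inputs without missing values is to apply na.omit to a single data frame holding the exposure, outcome, and covariates, so that the same rows are dropped from all three objects. The data below are hypothetical and serve only to show the pattern.

```r
# Hypothetical data with missing values in X, Y and one covariate
set.seed(4)
dat <- data.frame(X = rnorm(10), Y = rnorm(10), U1 = rnorm(10), U2 = rnorm(10))
dat$X[2]  <- NA
dat$U1[5] <- NA
dat$Y[5]  <- NA

# Keep complete cases only; rows 2 and 5 are removed from every column
complete <- na.omit(dat)

X <- complete$X
Y <- complete$Y
U <- as.matrix(complete[, c("U1", "U2")])
```

Dropping incomplete cases is simple but can bias the results when the data are not missing completely at random, which is where an imputation package may be the better choice.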
Value
beta |
The model averaged estimate of the causal effect ( |
stderr |
The estimated standard error of the causal effect estimate. |
models.X |
A matrix giving the posterior distribution of the exposure model. Each row corresponds to an exposure model. Within each row, the first |
models.Y |
A matrix giving the posterior distribution of the outcome model after applying the Occam's window. Each row corresponds to an outcome model. Within each row, the first |
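The effect of an Occam's window with a maximum ratio OR can be sketched as follows. The rule used here, keeping models whose posterior probability is within a factor OR of the most probable model, follows the general idea in Madigan and Raftery (1994). The probabilities are made up for illustration, and the renormalization step is an assumption of this sketch rather than a documented detail of GBCEE.

```r
# Hypothetical posterior probabilities for five outcome models
post_prob <- c(0.60, 0.25, 0.10, 0.04, 0.01)
OR <- 20

# Keep models whose probability is within a factor OR of the best model
keep <- post_prob >= max(post_prob) / OR

# Renormalise the retained probabilities (an assumption of this sketch)
post_kept <- post_prob[keep] / sum(post_prob[keep])
```

With these numbers the threshold is 0.60 / 20 = 0.03, so the model with probability 0.01 is excluded and the remaining four are rescaled to sum to one.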
Author(s)
Denis Talbot
References
Madigan, D., York, J., Allard, D. (1995) Bayesian graphical models for discrete data, International Statistical Review, 63, 215-232.
Madigan, D., Raftery, A. E. (1994) Model selection and accounting for model uncertainty in graphical models using Occam's window, Journal of the American Statistical Association, 89 (428), 1535-1546.
Talbot, D., Beaudoin, C. (2022) A generalized double robust Bayesian model averaging approach to causal effect estimation with application to the Study of Osteoporotic Fractures, Journal of Causal Inference, 10(1), 335-371.
See Also
Examples
#Example:
#In this example, both U1 and U2 are potential confounding covariates.
#Both are generated as independent N(0,1).
#X is generated as a function of both U1 and U2 with a N(0,1) error.
#Y is generated as a function of X and U1 with a N(0,1) error.
#Thus, only U1 is a confounder.
#Since both X and Y are continuous, the causal contrast estimated
#by GBCEE is E[Y^{x+1}] - E[Y^{x}] assuming a linear trend.
#The true value of the causal effect is 1.
#Unbiased estimation is possible when adjusting for U1 or
#adjusting for both U1 and U2.
#Loading the package
library(BCEE)
#Generating the data
set.seed(418949)
U1 = rnorm(200)
U2 = rnorm(200)
X = 0.5*U1 + 1*U2 + rnorm(200)
Y = 1*X + 0.5*U1 + rnorm(200)
#Using GBCEE to estimate the causal exposure effect
#Very few iterations are necessary since there are only 2 covariates
results = GBCEE(X, Y, cbind(U1, U2), omega = 500*sqrt(200), niter = 50,
                family.X = "gaussian", family.Y = "gaussian")
#Causal effect estimate
results$beta
#Estimated standard error
results$stderr
#Results from individual models
results$models.Y
#Posterior probability of inclusion of each covariate in the outcome model
#(inclusion indicators in columns 1:2, weighted by the last column of models.Y)
colSums(results$models.Y[, 1:2]*results$models.Y[, ncol(results$models.Y)])