NBCEE {BCEE} | R Documentation |
Naive BCEE Implementation
Description
N-BCEE implementation of the BCEE algorithm. This function supports exposures that can be modeled with generalized linear models (e.g., binary, continuous or Poisson), but it supports only continuous outcomes.
Usage
NBCEE(X, Y, U, omega, niter = 5000, nburn = 500, nthin = 10,
maxmodelX = NA, maxmodelY = NA, family.X = "gaussian")
Arguments
X |
A vector of observed values for the exposure. |
Y |
A vector of observed values for the continuous outcome. |
U |
A matrix of observed values for the |
omega |
The value of the hyperparameter omega in the BCEE's outcome model prior distribution. A recommended implementation is to take |
niter |
The number of post burn-in iterations in the Markov chain Monte Carlo model composition (MC^3) algorithm (Madigan et al. 1995), prior to applying thinning. The default is 5000. |
nburn |
The number of burn-in iterations (prior to applying thinning). The default is 500. |
nthin |
The thinning of the chain. The default is 10. |
maxmodelX |
The maximum number of exposure models the algorithm can explore. See |
maxmodelY |
The maximum number of distinct outcome models that the algorithm can explore. Choosing a smaller value can shorten computing time. However, choosing a value that is too small will cause the algorithm to crash. The default is |
family.X |
A description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See |
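As a rough illustration of how niter, nburn and nthin relate, the arithmetic below assumes that the chain runs nburn + niter iterations in total and that one draw is kept out of every nthin post burn-in iterations. This reading follows from the argument descriptions above; the exact bookkeeping inside NBCEE is an assumption here.

```r
# Default chain settings taken from the Usage section
niter <- 5000   # post burn-in iterations, before thinning
nburn <- 500    # burn-in iterations, before thinning
nthin <- 10     # thinning interval

# Total number of MC^3 iterations run (assumed: burn-in plus post burn-in)
total_iterations <- nburn + niter

# Number of draws kept after thinning (assumed: one draw per nthin iterations)
retained_draws <- niter %/% nthin

total_iterations
retained_draws
```

Under the same assumption, the settings used in the Examples section (niter = 1000, nthin = 5) would keep 200 draws.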
Details
NBCEE assumes there are no missing values in the objects X, Y and U. The na.omit function, which removes cases with missing data, or an imputation package might be helpful.
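A minimal sketch of the complete-case approach mentioned above, using only base R. The data frame dat and its column names are hypothetical. The point is to apply na.omit to the exposure, outcome and covariates jointly, so that the same rows are dropped from all three objects.

```r
# Hypothetical data set with a few missing values
set.seed(1)
dat <- data.frame(X = rnorm(10), Y = rnorm(10),
                  U1 = rnorm(10), U2 = rnorm(10))
dat$X[2] <- NA
dat$U2[7] <- NA

# Drop every row that has at least one missing value
complete <- na.omit(dat)

# Split the complete cases back into the objects NBCEE expects
X <- complete$X
Y <- complete$Y
U <- as.matrix(complete[, c("U1", "U2")])

# Check that nothing is missing and that the rows still line up
anyNA(X) || anyNA(Y) || anyNA(U)
length(X) == nrow(U)
```

Complete-case analysis discards information and can bias results when data are not missing completely at random, which is why an imputation package may be the better choice in some settings.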
Value
betas |
A vector containing the sampled values for the exposure effect. |
models.X |
A Boolean matrix identifying the sampled exposure models. See |
models.Y |
A Boolean matrix identifying the sampled outcome models. Each row corresponds to a sampled outcome model. Within each row, the |
Note
The variability of the exposure effect estimator is generally underestimated by the N-BCEE implementation of BCEE. The A-BCEE implementation, which is also faster, is thus preferred. Another option is to use N-BCEE with the nonparametric bootstrap (B-BCEE) to correctly estimate variability.
The difference in computing time between A-BCEE and N-BCEE is mostly explainable by the method used to compute the posterior distribution of the exposure model. In A-BCEE, this posterior distribution is calculated as a first step using bic.glm. In N-BCEE, the posterior distribution of the exposure model is computed inside the MC^3 algorithm.
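The nonparametric bootstrap mentioned in this note can be sketched generically as follows. This is an illustration under stated assumptions, not the package's B-BCEE code: bootstrap_sd and lm_estimator are hypothetical helpers, and a fully adjusted linear regression stands in for N-BCEE so that the sketch runs without the BCEE package. The commented lines at the end show how an NBCEE-based estimator might be plugged in, reusing the settings from the Examples section.

```r
# Nonparametric bootstrap standard deviation of an exposure effect estimator.
# `estimator` is any function(X, Y, U) that returns a single number.
bootstrap_sd <- function(X, Y, U, estimator, B = 200) {
  n <- length(Y)
  estimates <- vapply(seq_len(B), function(b) {
    # Resample rows with replacement, keeping X, Y and U aligned
    idx <- sample.int(n, size = n, replace = TRUE)
    estimator(X[idx], Y[idx], U[idx, , drop = FALSE])
  }, numeric(1))
  sd(estimates)
}

# Stand-in estimator (NOT BCEE): coefficient of X in a fully adjusted model
lm_estimator <- function(X, Y, U) {
  unname(coef(lm(Y ~ X + U))["X"])
}

# Same data-generating mechanism as in the Examples section
set.seed(418949)
U1 <- rnorm(200)
U2 <- rnorm(200)
X <- 0.5 * U1 + 1 * U2 + rnorm(200)
Y <- 1 * X + 0.5 * U1 + rnorm(200)

boot_sd <- bootstrap_sd(X, Y, cbind(U1, U2), lm_estimator, B = 100)
boot_sd

# With the BCEE package loaded, the estimator could instead be, for example:
# nbcee_estimator <- function(X, Y, U) {
#   mean(NBCEE(X, Y, U, omega = 500 * sqrt(length(Y)),
#              niter = 1000, nthin = 5, nburn = 20)$betas)
# }
```

Because each bootstrap replicate would rerun the full MC^3 algorithm, the computing time grows roughly linearly with B when an NBCEE-based estimator is used.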
Author(s)
Denis Talbot, Genevieve Lefebvre, Juli Atherton.
References
Madigan, D., York, J., Allard, D. (1995) Bayesian graphical models for discrete data, International Statistical Review, 63, 215-232.
Talbot, D., Lefebvre, G., Atherton, J. (2015) The Bayesian causal effect estimation algorithm, Journal of Causal Inference, 3(2), 207-236.
See Also
Examples
# In this example, U1 and U2 are potential confounding covariates
# generated as independent N(0,1).
# X is generated as a function of both U1 and U2 with a N(0,1) error.
# Y is generated as a function of X and U1 with a N(0,1) error.
# Variable U1 is the only confounder.
# The causal effect of X on Y equals 1.
# The exposure effect estimator (beta hat) in the outcome model
# including U1 and U2 or including U1 only is unbiased.
# The sample size is n = 200.
# Generating the data
set.seed(418949);
U1 = rnorm(200);
U2 = rnorm(200);
X = 0.5*U1 + 1*U2 + rnorm(200);
Y = 1*X + 0.5*U1 + rnorm(200);
# Using NBCEE to estimate the causal exposure effect
n = 200;
omega.c = 500;
results = NBCEE(X,Y,cbind(U1,U2), omega = omega.c*sqrt(n),
niter = 1000, nthin = 5, nburn = 20);
# The posterior mean of the exposure effect:
mean(results$betas);
# The posterior standard deviation of the exposure effect:
sd(results$betas);
# The posterior probability of inclusion of each covariate in the exposure model:
colMeans(results$models.X);
# The posterior distribution of the exposure model:
table(apply(results$models.X, 1, paste0, collapse = ""));
# The posterior probability of inclusion of each covariate in the outcome model:
colMeans(results$models.Y);
# The posterior distribution of the outcome model:
table(apply(results$models.Y, 1, paste0, collapse = ""));