ce_estimate {CIMTx} | R Documentation |
Causal inference with multiple treatments using observational data
Description
The function ce_estimate implements six different methods for
causal inference with multiple treatments using observational data.
Usage
ce_estimate(
y,
x,
w,
method,
formula = NULL,
discard = FALSE,
estimand,
trim_perc = NULL,
sl_library,
reference_trt,
boot = FALSE,
nboots,
verbose_boot = TRUE,
ndpost = 1000,
caliper = 0.25,
n_cluster = 5,
...
)
Arguments
y |
A numeric vector (0, 1) representing a binary outcome. |
x |
A dataframe, including all the covariates but not treatments. |
w |
A numeric vector representing the treatment groups. |
method |
A character string. Users can select from the
following methods including |
formula |
A |
discard |
A logical indicating whether to use the discarding rules
for the BART-based methods. The default is FALSE. |
estimand |
A character string representing the type of causal estimand.
Only |
trim_perc |
A numeric vector of length 2 giving the lower and upper percentiles
at which the inverse probability of treatment weights should be trimmed.
The default is NULL. |
sl_library |
A character vector of prediction algorithms.
A list of functions included in the SuperLearner package
can be found with |
reference_trt |
A numeric value indicating the reference treatment group for the ATT effect. |
boot |
A logical indicating whether to use the nonparametric
bootstrap to calculate the 95% confidence intervals of the causal
effect estimates. The default is FALSE. |
nboots |
A numeric value representing the number of bootstrap samples. |
verbose_boot |
A logical value indicating whether to
print the progress of the nonparametric bootstrap.
The default is TRUE. |
ndpost |
A numeric value indicating the number of posterior draws
for the Bayesian methods ( |
caliper |
A numeric value denoting the caliper which should be used
when matching ( |
n_cluster |
A numeric value denoting the number of clusters to form
using K-means clustering on the logit of the GPS when |
... |
Other parameters that can be passed through to functions. |
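To make the role of trim_perc concrete, the following is a minimal sketch of trimming inverse probability of treatment weights at two percentiles. It assumes that trimming means capping weights at the chosen percentiles rather than dropping observations; the helper trim_weights and the simulated propensities are hypothetical and are not part of CIMTx.

```r
# Sketch only: cap inverse probability of treatment weights at two percentiles.
# `trim_weights` is a hypothetical helper, not a CIMTx function.
trim_weights <- function(wts, trim_perc = c(0.05, 0.95)) {
  stopifnot(length(trim_perc) == 2, trim_perc[1] < trim_perc[2])
  bounds <- stats::quantile(wts, probs = trim_perc, names = FALSE)
  # Pull weights outside the bounds back to the nearest bound
  pmin(pmax(wts, bounds[1]), bounds[2])
}

set.seed(1)
# Simulated probabilities of receiving the treatment actually received
# (an assumption made purely for illustration)
ps <- runif(200, min = 0.02, max = 0.98)
wts <- 1 / ps

trimmed <- trim_weights(wts, trim_perc = c(0.05, 0.95))
range(wts)      # untrimmed weights can be very large
range(trimmed)  # trimmed weights lie between the two percentile bounds
```

Capping keeps every individual in the analysis while limiting the influence of extreme weights, which is the usual motivation for trimming.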
Value
A summary of the effect estimates can be obtained with the summary
function. For VM, the output contains the number of matched individuals.
For BART with discard = TRUE, the output contains the number of discarded
individuals. For the IPTW-related methods with boot = FALSE, the weight
distributions can be visualized using the plot function. For BART and RA,
the output contains a list of the posterior samples of the causal estimands.
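As an illustration of how posterior samples of a causal estimand are typically condensed into a point estimate and a 95% credible interval, here is a minimal base R sketch. The draws are simulated, and summarise_draws is a hypothetical helper; the object actually returned by ce_estimate may be organized differently.

```r
set.seed(2)
# Hypothetical posterior draws of a risk difference between two treatments.
# These are simulated and do not come from ce_estimate.
rd_draws <- rnorm(1000, mean = 0.10, sd = 0.03)

# Hypothetical helper: posterior mean, posterior SD, and the equal-tailed
# 95% credible interval taken from the 2.5th and 97.5th percentiles.
summarise_draws <- function(draws) {
  ci <- stats::quantile(draws, probs = c(0.025, 0.975), names = FALSE)
  c(estimate = mean(draws), se = stats::sd(draws),
    lower = ci[1], upper = ci[2])
}

post_summary <- summarise_draws(rd_draws)
print(round(post_summary, 3))
```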
References
Hu, L., Gu, C., Lopez, M., Ji, J., & Wisnivesky, J. (2020). Estimation of causal effects of multiple treatments in observational studies with a binary outcome. Statistical Methods in Medical Research, 29(11), 3218–3234.
Hu, L., & Gu, C. (2021). Estimation of causal effects of multiple treatments in healthcare database studies with rare outcomes. Health Services and Outcomes Research Methodology, 21, 287–308.
Sparapani, R., Spanbauer, C., & McCulloch, R. (2021). Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package. Journal of Statistical Software, 97(1), 1–66.
Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.7. URL: https://CRAN.R-project.org/package=dplyr
Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0
Matthew Cefalu, Greg Ridgeway, Dan McCaffrey, Andrew Morral, Beth Ann Griffin and Lane Burgette (2021). twang: Toolkit for Weighting and Analysis of Nonequivalent Groups. R package version 2.5. URL:https://CRAN.R-project.org/package=twang
Noah Greifer (2021). WeightIt: Weighting for Covariate Balance in Observational Studies. R package version 0.12.0. URL:https://CRAN.R-project.org/package=WeightIt
Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. URL:https://CRAN.R-project.org/package=stringr
Andrew Gelman and Yu-Sung Su (2020). arm: Data Analysis Using Regression and Multilevel/Hierarchical Models. R package version 1.11-2. URL:https://CRAN.R-project.org/package=arm
Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36
Eric Polley, Erin LeDell, Chris Kennedy and Mark van der Laan (2021). SuperLearner: Super Learner Prediction. R package version 2.0-28. URL:https://CRAN.R-project.org/package=SuperLearner
Susan Gruber, Mark J. van der Laan (2012). tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software, 51(13), 1-35.
Jasjeet S. Sekhon (2011). Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R. Journal of Statistical Software, 42(7), 1-52
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.1. URL:https://CRAN.R-project.org/package=cowplot
Elio Campitelli (2021). metR: Tools for Easier Analysis of Meteorological Fields. R package version 0.11.0. URL:https://github.com/eliocamp/metR
Hadley Wickham (2021). tidyr: Tidy Messy Data. R package version 1.1.4. https://CRAN.R-project.org/package=tidyr
Microsoft Corporation and Steve Weston (2020). doParallel: Foreach Parallel Adaptor for the 'parallel' Package. R package version 1.0.16. URL:https://CRAN.R-project.org/package=doParallel
Microsoft and Steve Weston (2020). foreach: Provides Foreach Looping Construct. R package version 1.5.1. URL:https://CRAN.R-project.org/package=foreach
Examples
library(CIMTx)

# Linear terms of the treatment assignment model, one string per treatment
lp_w_all <- c(
  ".4*x1 + .1*x2 - .1*x4 + .1*x5",        # w = 1
  ".2 * x1 + .2 * x2 - .2 * x4 - .3 * x5" # w = 2
)
# Nonlinear terms of the treatment assignment model
nlp_w_all <- c(
  "-.5*x1*x4 - .1*x2*x5", # w = 1
  "-.3*x1*x4 + .2*x2*x5"  # w = 2
)
# Linear and nonlinear terms of the outcome model, repeated for 3 treatments
lp_y_all <- rep(".2*x1 + .3*x2 - .1*x3 - .1*x4 - .2*x5", 3)
nlp_y_all <- rep(".7*x1*x1 - .1*x2*x3", 3)
# Covariate distributions, passed to data_sim as character strings
X_all <- c(
  "rnorm(0, 0.5)", # x1
  "rbeta(2, .4)",  # x2
  "runif(0, 0.5)", # x3
  "rweibull(1,2)", # x4
  "rbinom(1, .4)"  # x5
)

set.seed(111111)
data <- data_sim(
  sample_size = 300,
  n_trt = 3,
  x = X_all,
  lp_y = lp_y_all,
  nlp_y = nlp_y_all,
  align = FALSE,
  lp_w = lp_w_all,
  nlp_w = nlp_w_all,
  tau = c(-1.5, 0, 1.5),
  delta = c(0.5, 0.5),
  psi = 1
)

# Regression adjustment (RA) estimate of the ATE using 100 posterior draws
ce_estimate(
  y = data$y, x = data$covariates, w = data$w,
  ndpost = 100, method = "RA", estimand = "ATE"
)
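The boot and nboots arguments request nonparametric bootstrap confidence intervals. The base R sketch below shows one common form of that idea, a percentile bootstrap, on simulated data. It is not the package's internal implementation: the estimator risk_diff is a hypothetical, unadjusted stand-in for whichever causal estimator is being bootstrapped, and the data are simulated independently of data_sim.

```r
set.seed(3)
n <- 300
w <- sample(1:3, n, replace = TRUE)            # three treatment groups
y <- rbinom(n, 1, prob = c(0.2, 0.3, 0.4)[w])  # binary outcome

# Hypothetical estimator: unadjusted risk difference, treatment 2 vs 1.
# A real analysis would plug in an adjusted causal estimator here.
risk_diff <- function(y, w) mean(y[w == 2]) - mean(y[w == 1])

nboots <- 200
boot_est <- replicate(nboots, {
  idx <- sample.int(n, n, replace = TRUE)  # resample individuals with replacement
  risk_diff(y[idx], w[idx])
})

# Percentile 95% confidence interval from the bootstrap distribution
boot_ci <- stats::quantile(boot_est, probs = c(0.025, 0.975), names = FALSE)
point_est <- risk_diff(y, w)
c(estimate = point_est, lower = boot_ci[1], upper = boot_ci[2])
```

Because each bootstrap replicate refits the estimator, run time grows roughly linearly with nboots.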