ce_estimate {CIMTx}    R Documentation

Causal inference with multiple treatments using observational data

Description

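The function ce_estimate implements six families of methods (RA, VM, BART, TMLE, IPTW, and RAMS) for causal inference with multiple treatments using observational data. The IPTW and RAMS families each come in three variants (Multinomial, GBM, and SL), giving the ten values accepted by the method argument.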

Usage

ce_estimate(
  y,
  x,
  w,
  method,
  formula = NULL,
  discard = FALSE,
  estimand,
  trim_perc = NULL,
  sl_library,
  reference_trt,
  boot = FALSE,
  nboots,
  verbose_boot = TRUE,
  ndpost = 1000,
  caliper = 0.25,
  n_cluster = 5,
  ...
)

Arguments

y

A numeric vector of 0s and 1s representing a binary outcome.

x

A data frame containing all the covariates but not the treatment variable.

w

A numeric vector representing the treatment groups.

method

A character string specifying the estimation method. Users can select one of the following: "RA", "VM", "BART", "TMLE", "IPTW-Multinomial", "IPTW-GBM", "IPTW-SL", "RAMS-Multinomial", "RAMS-GBM", "RAMS-SL".

formula

A formula object representing the variables used for the analysis. The default is to use all terms specified in x.

discard

A logical indicating whether to use the discarding rules for the BART based methods. The default is FALSE.

estimand

A character string representing the type of causal estimand. Only "ATT" or "ATE" is allowed. When estimand = "ATT", users must also specify the reference treatment group by setting the reference_trt argument.

trim_perc

A numeric vector of length 2 indicating the percentiles at which the inverse probability of treatment weights should be trimmed. The default is NULL.
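The following is a minimal base R sketch of percentile trimming, not the package's internal code. It assumes that trimming means capping weights at the lower and upper percentiles; the simulated generalized propensity scores and all object names (gps, iptw, bounds) are hypothetical.

```r
# Illustrative only: simulated weights, not output from ce_estimate().
set.seed(1)
n <- 500

# Hypothetical generalized propensity scores for 3 treatments (rows sum to 1).
lp <- cbind(0, rnorm(n, 0.3), rnorm(n, -0.2))
gps <- exp(lp) / rowSums(exp(lp))

# Draw the received treatment from the GPS.
w <- apply(gps, 1, function(p) sample(1:3, size = 1, prob = p))

# ATE-type weight: inverse probability of the treatment actually received.
iptw <- 1 / gps[cbind(seq_len(n), w)]

# Assumption: trim_perc = c(0.05, 0.95) caps weights at the 5th and 95th percentiles.
trim_perc <- c(0.05, 0.95)
bounds <- quantile(iptw, probs = trim_perc)
iptw_trimmed <- pmin(pmax(iptw, bounds[1]), bounds[2])

range(iptw)          # untrimmed weights
range(iptw_trimmed)  # trimmed weights lie within the percentile bounds
```

Capping rather than dropping keeps every observation in the analysis while limiting the influence of extreme weights.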

sl_library

A character vector of prediction algorithms. A list of the functions included in the SuperLearner package can be obtained with SuperLearner::listWrappers().

reference_trt

A numeric value indicating the reference treatment group for the ATT.

boot

A logical indicating whether or not to use nonparametric bootstrap to calculate the 95% confidence intervals of the causal effect estimates. The default is FALSE.

nboots

A numeric value representing the number of bootstrap samples.
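As a rough illustration of how a nonparametric bootstrap yields a 95% percentile interval, the sketch below resamples individuals with replacement and recomputes an estimate nboots times. The estimator rd_12 (an unadjusted risk difference) is a hypothetical stand-in for whichever method is selected, and the data are simulated; this is not the package's implementation.

```r
# Illustrative only: simulated data and a stand-in estimator.
set.seed(2)
n <- 300
w <- sample(1:3, n, replace = TRUE)
y <- rbinom(n, 1, prob = c(0.2, 0.3, 0.5)[w])

# Hypothetical estimator: unadjusted risk difference, treatment 1 vs. 2.
rd_12 <- function(y, w) mean(y[w == 1]) - mean(y[w == 2])

nboots <- 200
boot_est <- replicate(nboots, {
  idx <- sample.int(n, n, replace = TRUE)  # resample individuals with replacement
  rd_12(y[idx], w[idx])
})

# 95% percentile interval from the bootstrap distribution.
ci <- quantile(boot_est, probs = c(0.025, 0.975))
ci
```

Larger values of nboots give a more stable interval at proportionally higher computational cost.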

verbose_boot

A logical value indicating whether to print the progress of nonparametric bootstrap. The default is TRUE.

ndpost

A numeric value indicating the number of posterior draws for the Bayesian methods ("BART" and "RA"). The default is 1000.

caliper

A numeric value denoting the caliper to use when matching (method = "VM") on the logit of the generalized propensity score (GPS) within each cluster formed by K-means clustering. The caliper is in standardized units. For example, caliper = 0.25 means that all matches farther apart than 0.25 standard deviations of the logit of the GPS are dropped. The default value is 0.25.

n_cluster

A numeric value denoting the number of clusters to form using K-means clustering on the logit of the GPS when method = "VM". The default value is 5.
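To make the roles of caliper and n_cluster concrete, here is a simplified base R sketch: units are clustered on the logit of a simulated GPS, and each unit in treatment group 1 is matched (with replacement) to its nearest neighbor in group 2 within the same cluster, provided the distance is within the caliper. The single GPS column, the two-group matching, and the helper match_one are simplifying assumptions for illustration and do not reproduce the package's vector matching procedure.

```r
# Illustrative only: simulated GPS and a simplified one-to-one match.
set.seed(3)
n <- 400
w <- sample(1:3, n, replace = TRUE)
logit_gps <- rnorm(n)  # hypothetical logit of the GPS

# Form clusters on the logit of the GPS.
n_cluster <- 5
km <- kmeans(logit_gps, centers = n_cluster, nstart = 10)

# Caliper in standardized units: 0.25 standard deviations of the logit GPS.
caliper <- 0.25
max_dist <- caliper * sd(logit_gps)

group1 <- which(w == 1)
group2 <- which(w == 2)

# Hypothetical helper: nearest neighbor in group 2, same cluster, within caliper.
match_one <- function(i) {
  cand <- group2[km$cluster[group2] == km$cluster[i]]
  if (length(cand) == 0) return(NA_integer_)
  d <- abs(logit_gps[cand] - logit_gps[i])
  if (min(d) > max_dist) return(NA_integer_)  # no match within the caliper
  cand[which.min(d)]
}
matches <- vapply(group1, match_one, integer(1))

sum(!is.na(matches))  # number of group 1 units with an acceptable match
```

A smaller caliper or a larger number of clusters makes matches closer but leaves more individuals unmatched.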

...

Other parameters that can be passed through to functions.

Value

A summary of the effect estimates can be obtained with the summary function. The remaining output depends on the method: for VM, it contains the number of matched individuals; for BART with discard = TRUE, it contains the number of discarded individuals; for BART and RA, it contains a list of the posterior samples of the causal estimands. For the IPTW-related methods with boot = FALSE, the weight distributions can be visualized with the plot function.
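Posterior samples of a causal estimand are typically reduced to a point estimate and a 95% credible interval. The sketch below does this by hand for simulated draws; the vector draws is hypothetical, since the element names of the returned list are not documented here, and in practice the summary function performs this step.

```r
# Illustrative only: simulated posterior draws of a risk difference.
set.seed(4)
ndpost <- 1000
draws <- rnorm(ndpost, mean = 0.10, sd = 0.03)

# Posterior mean, posterior standard deviation, and 95% credible interval.
post_summary <- c(
  est   = mean(draws),
  se    = sd(draws),
  lower = unname(quantile(draws, 0.025)),
  upper = unname(quantile(draws, 0.975))
)
round(post_summary, 3)
```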

References

Hu, L., Gu, C., Lopez, M., Ji, J., & Wisnivesky, J. (2020). Estimation of causal effects of multiple treatments in observational studies with a binary outcome. Statistical Methods in Medical Research, 29(11), 3218–3234.

Hu, L., & Gu, C. (2021). Estimation of causal effects of multiple treatments in healthcare database studies with rare outcomes. Health Services and Outcomes Research Methodology, 21, 287–308.

Sparapani, R., Spanbauer, C., & McCulloch, R. (2021). Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package. Journal of Statistical Software, 97(1), 1-66.

Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.7. URL: https://CRAN.R-project.org/package=dplyr

Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0

Matthew Cefalu, Greg Ridgeway, Dan McCaffrey, Andrew Morral, Beth Ann Griffin and Lane Burgette (2021). twang: Toolkit for Weighting and Analysis of Nonequivalent Groups. R package version 2.5. URL: https://CRAN.R-project.org/package=twang

Noah Greifer (2021). WeightIt: Weighting for Covariate Balance in Observational Studies. R package version 0.12.0. URL: https://CRAN.R-project.org/package=WeightIt

Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. URL: https://CRAN.R-project.org/package=stringr

Andrew Gelman and Yu-Sung Su (2020). arm: Data Analysis Using Regression and Multilevel/Hierarchical Models. R package version 1.11-2. URL: https://CRAN.R-project.org/package=arm

Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36

Eric Polley, Erin LeDell, Chris Kennedy and Mark van der Laan (2021). SuperLearner: Super Learner Prediction. R package version 2.0-28. URL: https://CRAN.R-project.org/package=SuperLearner

Susan Gruber, Mark J. van der Laan (2012). tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software, 51(13), 1-35.

Jasjeet S. Sekhon (2011). Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R. Journal of Statistical Software, 42(7), 1-52

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.1. URL: https://CRAN.R-project.org/package=cowplot

Elio Campitelli (2021). metR: Tools for Easier Analysis of Meteorological Fields. R package version 0.11.0. URL: https://github.com/eliocamp/metR

Hadley Wickham (2021). tidyr: Tidy Messy Data. R package version 1.1.4. URL: https://CRAN.R-project.org/package=tidyr

Microsoft Corporation and Steve Weston (2020). doParallel: Foreach Parallel Adaptor for the 'parallel' Package. R package version 1.0.16. URL: https://CRAN.R-project.org/package=doParallel

Microsoft and Steve Weston (2020). foreach: Provides Foreach Looping Construct. R package version 1.5.1. URL: https://CRAN.R-project.org/package=foreach

Examples

# data_sim() and ce_estimate() both come from the CIMTx package.
library(CIMTx)

lp_w_all <-
  c(
    ".4*x1 + .1*x2  - .1*x4 + .1*x5", # w = 1
    ".2 * x1 + .2 * x2  - .2 * x4 - .3 * x5"
  ) # w = 2
nlp_w_all <-
  c(
    "-.5*x1*x4  - .1*x2*x5", # w = 1
    "-.3*x1*x4 + .2*x2*x5"
  ) # w = 2
lp_y_all <- rep(".2*x1 + .3*x2 - .1*x3 - .1*x4 - .2*x5", 3)
nlp_y_all <- rep(".7*x1*x1  - .1*x2*x3", 3)
X_all <- c(
  "rnorm(0, 0.5)", # x1
  "rbeta(2, .4)", # x2
  "runif(0, 0.5)", # x3
  "rweibull(1,2)", # x4
  "rbinom(1, .4)" # x5
)

set.seed(111111)
data <- data_sim(
  sample_size = 300,
  n_trt = 3,
  x = X_all,
  lp_y = lp_y_all,
  nlp_y = nlp_y_all,
  align = FALSE,
  lp_w = lp_w_all,
  nlp_w = nlp_w_all,
  tau = c(-1.5, 0, 1.5),
  delta = c(0.5, 0.5),
  psi = 1
)
ce_estimate(
  y = data$y, x = data$covariates, w = data$w,
  ndpost = 100, method = "RA", estimand = "ATE"
)

[Package CIMTx version 1.2.0 Index]