simulate_confounded_data {independenceWeights}    R Documentation

Simulation of confounded data with a continuous treatment

Description

Simulates confounded data with a continuous treatment, following the simulation design of Vegetabile et al. (2021).

Usage

simulate_confounded_data(
  seed = 1,
  nobs = 1000,
  MX1 = -0.5,
  MX2 = 1,
  MX3 = 0.3,
  A_effect = TRUE
)

Arguments

seed

random seed for reproducibility

nobs

number of observations

MX1

the mean of the first covariate. Defaults to -0.5, the value used in the simulations of Vegetabile et al. (2021).

MX2

the mean of the second and fourth covariates. Defaults to 1, the value used in the simulations of Vegetabile et al. (2021).

MX3

the probability that the fifth covariate (a binary covariate) is equal to 1. Defaults to 0.3, the value used in the simulations of Vegetabile et al. (2021).

A_effect

whether (TRUE) or not (FALSE) the treatment has a causal effect on the outcome. If TRUE, the setting is that of the main text of Vegetabile et al. (2021). If FALSE, the setting is that of the Appendix of Vegetabile et al. (2021).
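As a rough illustration of how these arguments might be combined, the sketch below calls the function with non-default settings. It assumes the independenceWeights package is installed; the particular values (seed 42, 200 observations, the shifted MX1, MX2, and MX3) are arbitrary choices for illustration, not settings taken from the paper.

```r
library(independenceWeights)  # assumes the package is installed

# Smaller sample, different seed, and the no-treatment-effect setting
sim_null <- simulate_confounded_data(seed = 42, nobs = 200, A_effect = FALSE)
nrow(sim_null$data)  # number of simulated rows; documented to equal nobs

# Shift the covariate distribution away from the defaults
# (illustrative values only)
sim_shift <- simulate_confounded_data(seed = 42, nobs = 200,
                                      MX1 = 0, MX2 = 2, MX3 = 0.5)

# Re-using a seed is expected to reproduce the same dataset
sim_again <- simulate_confounded_data(seed = 42, nobs = 200, A_effect = FALSE)
identical(sim_null$data, sim_again$data)
```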

Value

A list with the following elements:

data

a data.frame with nobs rows containing the response (Y), treatment (A), confounders (Z1 to Z5), and the true average dose-response function (truth)

true_adrf

a function that takes values of the treatment A and returns the true average dose-response function (ADRF), E(Y(A)), of the data-generating mechanism used to generate data

original_covariates

the original, untransformed covariates in the simulation setup. Do not use these when estimating the ADRF, as doing so makes the simulation setting significantly easier.
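The returned list can be inspected along the following lines. This is a minimal sketch that assumes the independenceWeights package is installed; the object names sim, a_grid, and adrf_on_grid, and the grid size of 25, are illustrative only.

```r
library(independenceWeights)  # assumes the package is installed

sim <- simulate_confounded_data(seed = 1, nobs = 300)

names(sim)                     # elements of the returned list
head(sim$data[, c("Y", "A")])  # response and treatment columns

# Evaluate the true ADRF on a grid spanning the observed treatment range
a_grid <- seq(min(sim$data$A), max(sim$data$A), length.out = 25)
adrf_on_grid <- sim$true_adrf(a_grid)
length(adrf_on_grid)
```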

References

Vegetabile, B. G., Griffin, B. A., Coffman, D. L., Cefalu, M., Robbins, M. W., and McCaffrey, D. F. (2021). Nonparametric estimation of population average dose-response curves using entropy balancing weights for continuous exposures. Health Services and Outcomes Research Methodology, 21(1), 69-110.

Examples


library(independenceWeights)

## simulate 500 observations with a fixed seed
simdat <- simulate_confounded_data(seed = 999, nobs = 500)

str(simdat$data)

A <- simdat$data$A
y <- simdat$data$Y

## grid of treatment values on which to evaluate the ADRF
trt_vec <- seq(min(A), max(A), length.out = 500)
true_adrf_vals <- simdat$true_adrf(trt_vec)

## scatterplot of the data with the true ADRF overlaid in blue
ylims <- range(c(y, true_adrf_vals))
plot(x = A, y = y, ylim = ylims)
lines(x = trt_vec, y = true_adrf_vals, col = "blue", lwd = 2)

## naive estimate of ADRF without weights (all weights equal to 1), in green
adrf_hat_unwtd <- weighted_kernel_est(A, y, rep(1, length(y)), trt_vec)$est
lines(x = trt_vec, y = adrf_hat_unwtd, col = "green", lwd = 2)
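To give a feel for what a weighted kernel estimate of a dose-response curve computes, here is a minimal base-R sketch of a weighted Nadaraya-Watson smoother with a Gaussian kernel. It is not the implementation behind weighted_kernel_est(), which may differ (for example in kernel, bandwidth selection, or local polynomial order). The helper name nw_weighted_sketch, the bandwidth of 0.5, and the toy data are all assumptions made up for illustration.

```r
# Hypothetical helper, NOT the package's weighted_kernel_est():
# a weighted Nadaraya-Watson smoother with a Gaussian kernel.
nw_weighted_sketch <- function(A, y, w, grid, bw) {
  vapply(grid, function(a0) {
    k <- w * dnorm((A - a0) / bw)  # sample weight times kernel weight
    sum(k * y) / sum(k)            # locally weighted mean of y at a0
  }, numeric(1))
}

set.seed(1)
A_toy <- runif(200, 0, 10)                  # toy treatment values (made up)
y_toy <- sin(A_toy) + rnorm(200, sd = 0.2)  # toy outcome (made up)
grid  <- seq(1, 9, length.out = 50)

# Unit weights give the unweighted (naive) curve
est_unwtd <- nw_weighted_sketch(A_toy, y_toy, rep(1, 200), grid, bw = 0.5)
```

Supplying balancing weights in place of rep(1, 200) would reweight each observation's contribution to the local average, which is the general idea behind comparing a weighted estimate against the naive one.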



[Package independenceWeights version 0.0.1 Index]