data_sim {CIMTx} | R Documentation |
Simulate data for binary outcome with multiple treatments
Description
The function data_sim simulates data for a binary outcome with
multiple treatments. Users can adjust the following 7 design factors:
(1) sample size, (2) ratio of units across treatment groups,
(3) whether the treatment assignment model and the outcome generating model
are linear or nonlinear, (4) whether the covariates that best predict
the treatment also predict the outcome well,
(5) whether the response surfaces are parallel across treatment groups,
(6) outcome prevalence, and (7) degree of covariate overlap.
Usage
data_sim(
sample_size,
n_trt = 3,
x = "rnorm(0, 1)",
lp_y = rep("x1", 3),
nlp_y = NULL,
align = TRUE,
tau = c(0, 0, 0),
delta = c(0, 0),
psi = 1,
lp_w,
nlp_w
)
Arguments
sample_size |
A numeric value indicating the total number of units. |
n_trt |
A numeric value indicating the number of treatments. The default is set to 3. |
x |
A vector of characters representing covariates,
with each covariate generated from a standard probability distribution.
The default is set to "rnorm(0, 1)".
|
lp_y |
A vector of characters of length n_trt |
nlp_y |
A vector of characters of length n_trt |
align |
A logical indicating whether the predictors in the
treatment assignment model are the same as the predictors for
the outcome generating model.
The default is TRUE. |
tau |
A numeric vector of length n_trt |
delta |
A numeric vector of length n_trt - 1 |
psi |
A numeric value for the parameter governing the sparsity of covariate overlap. Higher values mean weaker covariate overlap; lower values mean stronger covariate overlap. The default is set to 1, which corresponds to a moderate covariate overlap. |
lp_w |
A vector of characters of length n_trt - 1 |
nlp_w |
A vector of characters of length n_trt - 1 |
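The covariate specifications passed to x, such as "rnorm(0, 1)", are written without a sample-size argument. The sketch below shows one way such a string could be expanded into draws, assuming the sample size is supplied as the first argument of the named generator. The helper draw_covariate is hypothetical and is not part of CIMTx; it only illustrates the idea and may differ from what data_sim does internally.

```r
# Hypothetical helper (not part of CIMTx): turn a spec such as "rnorm(0, 1)"
# into n random draws by inserting n as the first argument of the call.
draw_covariate <- function(spec, n) {
  expr <- str2lang(spec)                      # parse the string into a call
  fn <- match.fun(as.character(expr[[1]]))    # look up rnorm, rbeta, ...
  args <- lapply(as.list(expr)[-1], eval)     # evaluate the literal parameters
  do.call(fn, c(list(n), args))               # e.g. rnorm(n, 0, 1)
}

set.seed(1)
specs <- c("rnorm(0, 0.5)", "rbeta(2,0.4)", "rbinom(1,0.4)")
covariates <- sapply(specs, draw_covariate, n = 5)  # one column per spec
colnames(covariates) <- paste0("x", seq_along(specs))
dim(covariates)
```

Under this reading, "rbinom(1,0.4)" yields a binary covariate (size 1, probability 0.4), and the columns are named x1, x2, ... so that expressions such as ".4*x1 + .1*x2" can refer to them.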
Value
A list with 7 elements for simulated data. It contains
covariates: |
x matrix |
w: |
treatment indicators |
y: |
observed binary outcomes |
y_prev: |
outcome prevalence rates |
ratio_of_units: |
the proportions of units in each treatment group |
overlap_fig: |
the visualization of covariate overlap via boxplots of the distributions of the true generalized propensity scores (GPS) |
y_true: |
simulated true outcome in each treatment group |
References
Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. URL: https://CRAN.R-project.org/package=stringr
Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.7. URL: https://CRAN.R-project.org/package=dplyr
Examples
library(CIMTx)
lp_w_all <-
  c(
    ".4*x1 + .1*x2 - .1*x4 + .1*x5", # w = 1
    ".2 * x1 + .2 * x2 - .2 * x4 - .3 * x5" # w = 2
  )
nlp_w_all <-
  c(
    "-.5*x1*x4 - .1*x2*x5", # w = 1
    "-.3*x1*x4 + .2*x2*x5" # w = 2
  )
lp_y_all <- rep(".2*x1 + .3*x2 - .1*x3 - .1*x4 - .2*x5", 3)
nlp_y_all <- rep(".7*x1*x1 - .1*x2*x3", 3)
X_all <- c(
  "rnorm(0, 0.5)", # x1
  "rbeta(2,0.4)", # x2
  "runif(0, 0.5)", # x3
  "rweibull(1,2)", # x4
  "rbinom(1,0.4)" # x5
)
set.seed(111111)
data <- data_sim(
  sample_size = 300,
  n_trt = 3,
  x = X_all,
  lp_y = lp_y_all,
  nlp_y = nlp_y_all,
  align = FALSE,
  lp_w = lp_w_all,
  nlp_w = nlp_w_all,
  tau = c(-1.5, 0, 1.5),
  delta = c(0.5, 0.5),
  psi = 1
)
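After simulating, it can be useful to check the treatment group proportions and the observed event rate per group against the intended design. The sketch below is a minimal base R illustration, assuming that the w and y elements of the returned list are vectors of equal length, with one entry per unit. The helper summarise_sim and the stand-in object toy are hypothetical and are not part of CIMTx.

```r
# Hypothetical helper (not part of CIMTx): summarise a list that has the
# elements `w` (treatment indicators) and `y` (binary outcomes).
summarise_sim <- function(sim) {
  list(
    ratio_of_units = prop.table(table(sim$w)),  # share of units per treatment
    y_prev_by_trt = tapply(sim$y, sim$w, mean)  # observed event rate per treatment
  )
}

# Stand-in object so the sketch runs without CIMTx installed.
toy <- list(w = c(1, 1, 2, 2, 3, 3), y = c(0, 1, 1, 1, 0, 0))
res <- summarise_sim(toy)
res
```

With the simulated object from the example above, the call would be summarise_sim(data), provided the assumption about w and y holds.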