data_sim {CIMTx} | R Documentation |

The function `data_sim` simulates data for a binary outcome with multiple treatments. Users can adjust the following 7 design factors: (1) sample size, (2) ratio of units across treatment groups, (3) whether the treatment assignment model and the outcome generating model are linear or nonlinear, (4) whether the covariates that best predict the treatment also predict the outcome well, (5) whether the response surfaces are parallel across treatment groups, (6) outcome prevalence, and (7) degree of covariate overlap.

data_sim( sample_size, n_trt, X, lp_y, nlp_y, align = TRUE, tau, delta, psi, lp_w, nlp_w )

`sample_size` |
A numeric value indicating the total number of units. |

`n_trt` |
A numeric value indicating the number of treatments. |

`X` |
A vector of characters representing covariates, with each covariate generated from a standard probability distribution function given as a string, such as `"rnorm(300, 0, 0.5)"`. |

`lp_y` |
A vector of characters of length |

`nlp_y` |
A vector of characters of length |

`align` |
A logical indicating whether the predictors in the treatment assignment model are the same as the predictors in the outcome generating model. The default is `TRUE`. |

`tau` |
A numeric vector of length |

`delta` |
A numeric vector of length |

`psi` |
A numeric value for the parameter governing the sparsity of covariate overlap. |

`lp_w` |
A vector of characters of length |

`nlp_w` |
A vector of characters of length |
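
Several arguments (`X`, `lp_y`, `nlp_y`, `lp_w`, `nlp_w`) are passed as character strings containing R expressions. The sketch below shows, in base R only, one way such strings can be turned into covariates and a linear predictor. It is an illustration of the input format, not the package's internal code; the names `X_spec`, `lp_spec`, `covariates`, and `lp` are assumptions made for this example.

```r
# Illustration only: evaluating string specifications in base R.
# This is NOT taken from the CIMTx source; it only mirrors the input format.
set.seed(1)

# Covariate specifications, one string per covariate (same style as `X`)
X_spec <- c(
  "rnorm(300, 0, 0.5)",  # x1
  "rbinom(300, 1, .4)"   # x2
)

# Evaluate each string to obtain one column, then name the columns x1, x2, ...
cols <- lapply(X_spec, function(s) eval(parse(text = s)))
names(cols) <- paste0("x", seq_along(cols))
covariates <- as.data.frame(cols)

# A linear-predictor string written in terms of x1 and x2 (same style as `lp_y`)
lp_spec <- ".2*x1 + .3*x2"

# Evaluate the string with the covariate columns in scope
lp <- eval(parse(text = lp_spec), envir = covariates)

dim(covariates)
length(lp)
```

Because the strings are evaluated as R code, the sample size written inside each covariate string (300 here) should agree with `sample_size`, as it does in the example call.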

A list with 7 elements of simulated data, containing:

`covariates:` |
X matrix |

`w:` |
treatment indicators |

`y:` |
observed binary outcomes |

`y_prev:` |
outcome prevalence rates |

`ratio_of_units:` |
the proportions of units in each treatment group |

`overlap_fig:` |
the visualization of covariate overlap via boxplots of the distributions of true GPS |

`Y_true:` |
simulated true outcome in each treatment group |
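
To make the `ratio_of_units` and `y_prev` elements concrete, the following sketch computes group proportions and per-group outcome prevalence from a hand-built stand-in list with `w` and `y` elements. The stand-in `sim`, the treatment coding 1, 2, 3, and the summaries shown are assumptions for illustration; the documentation does not state how `data_sim` computes these elements internally.

```r
# Stand-in for part of a data_sim() result (hypothetical values, not package output)
set.seed(2)
sim <- list(
  w = sample(rep(1:3, times = c(120, 100, 80))),  # treatment indicators
  y = rbinom(300, 1, 0.3)                         # observed binary outcomes
)

# Proportion of units in each treatment group
ratio <- prop.table(table(sim$w))

# Observed outcome prevalence within each treatment group
prev <- tapply(sim$y, sim$w, mean)

ratio
prev
```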

Hu, L., Ji, J. (2021). CIMTx: An R package for causal inference with multiple treatments using observational data. arXiv:2110.10276

library(CIMTx)
lp_w_all <-
  c(".4*x1 + .1*x2 - .1*x4 + .1*x5",       # w = 1
    ".2 * x1 + .2 * x2 - .2 * x4 - .3 * x5") # w = 2
nlp_w_all <-
  c("-.5*x1*x4 - .1*x2*x5", # w = 1
    "-.3*x1*x4 + .2*x2*x5") # w = 2
lp_y_all <- rep(".2*x1 + .3*x2 - .1*x3 - .1*x4 - .2*x5", 3)
nlp_y_all <- rep(".7*x1*x1 - .1*x2*x3", 3)
X_all <- c(
  "rnorm(300, 0, 0.5)", # x1
  "rbeta(300, 2, .4)",  # x2
  "runif(300, 0, 0.5)", # x3
  "rweibull(300,1,2)",  # x4
  "rbinom(300, 1, .4)"  # x5
)
set.seed(111111)
data <- data_sim(
  sample_size = 300,
  n_trt = 3,
  X = X_all,
  lp_y = lp_y_all,
  nlp_y = nlp_y_all,
  align = FALSE,
  lp_w = lp_w_all,
  nlp_w = nlp_w_all,
  tau = c(-1.5, 0, 1.5),
  delta = c(0.5, 0.5),
  psi = 1
)
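
In the example call, with `n_trt = 3`, the outcome-model inputs (`lp_y`, `nlp_y`, `tau`) each have three entries and the treatment-model inputs (`lp_w`, `nlp_w`, `delta`) each have two. A pre-flight check along those lines might look like the sketch below. Treating these lengths as requirements is an inference from the example, not a documented rule, and `check_lengths` is a hypothetical helper that is not part of CIMTx.

```r
# Hypothetical helper (not part of CIMTx): checks that argument lengths
# follow the pattern seen in the example call.
check_lengths <- function(n_trt, lp_y, nlp_y, tau, lp_w, nlp_w, delta) {
  # Assumed: one entry per treatment group
  per_trt <- c(length(lp_y), length(nlp_y), length(tau))
  # Assumed: one entry per non-reference treatment group
  per_contrast <- c(length(lp_w), length(nlp_w), length(delta))
  all(per_trt == n_trt) && all(per_contrast == n_trt - 1)
}

# Same values as in the example call
ok <- check_lengths(
  n_trt = 3,
  lp_y  = rep(".2*x1 + .3*x2 - .1*x3 - .1*x4 - .2*x5", 3),
  nlp_y = rep(".7*x1*x1 - .1*x2*x3", 3),
  tau   = c(-1.5, 0, 1.5),
  lp_w  = c(".4*x1 + .1*x2 - .1*x4 + .1*x5",
            ".2 * x1 + .2 * x2 - .2 * x4 - .3 * x5"),
  nlp_w = c("-.5*x1*x4 - .1*x2*x5", "-.3*x1*x4 + .2*x2*x5"),
  delta = c(0.5, 0.5)
)
ok
```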

[Package *CIMTx* version 1.1.0 Index]