SimDiD {DiDforBigData} | R Documentation |
DiD data simulator with staggered treatment.
Description
Simulate data from the model Y_it = alpha_i + mu_t + ATT*(t >= G_i) + epsilon_it, where i is the individual, t is the year, and G_i is the cohort (the year of treatment onset for individual i). The ATT is ATTat0 + EventTime*ATTgrowth + cohort_counter*ATTcohortdiff, where cohort_counter is the order of the treated cohort (first, second, etc.).
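The ATT formula above can be sketched in base R. This is only an illustration, not the package's internal code: the helper name att_sketch is hypothetical, and it assumes that EventTime = t - G_i and that cohort_counter equals 1 for the earliest treated cohort, neither of which is stated in this documentation.

```r
# Hypothetical helper illustrating the ATT formula from the Description.
# Assumptions (not confirmed by the documentation):
#   EventTime = year - cohort
#   cohort_counter = 1 for the earliest treated cohort, 2 for the next, ...
att_sketch <- function(year, cohort, cohorts = c(2007, 2010, 2012),
                       ATTat0 = 1, ATTgrowth = 1, ATTcohortdiff = 0.5) {
  event_time <- year - cohort
  cohort_counter <- match(cohort, sort(cohorts))
  att <- ATTat0 + event_time * ATTgrowth + cohort_counter * ATTcohortdiff
  # The model multiplies the ATT by the indicator (t >= G_i),
  # so the effect is zero before treatment onset.
  att * (year >= cohort)
}

# Effect for the 2007 cohort in its first three treated years
att_sketch(year = 2007:2009, cohort = 2007)
```

Under these assumptions the effect grows by ATTgrowth per year after onset, and later cohorts start from a level that is higher by ATTcohortdiff per cohort.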
Usage
SimDiD(
seed = 1,
sample_size = 100,
cohorts = c(2007, 2010, 2012),
ATTat0 = 1,
ATTgrowth = 1,
ATTcohortdiff = 0.5,
anticipation = 0,
minyear = 2003,
maxyear = 2013,
idvar = 1,
yearvar = 1,
shockvar = 1,
indivAR1 = FALSE,
time_covars = FALSE,
clusters = FALSE,
markets = FALSE,
randomNA = FALSE,
missingCohorts = NULL
)
Arguments
seed |
Set the random seed. Default is seed=1. |
sample_size |
Number of individuals. Default is sample_size=100. |
cohorts |
Vector of years at which treatment onset occurs. Default is cohorts=c(2007,2010,2012). |
ATTat0 |
Treatment effect at event time 0. Default is 1. |
ATTgrowth |
Increment in the ATT for each event time after 0. Default is 1. |
ATTcohortdiff |
Increment in the ATT for each successive cohort. Default is 0.5. |
anticipation |
Number of years before a cohort's treatment onset in which 50% treatment effects are allowed. Default is anticipation=0. |
minyear |
Minimum calendar year to include in the data. Default is minyear=2003. |
maxyear |
Maximum calendar year to include in the data. Default is maxyear=2013. |
idvar |
Variance of individual fixed effects (alpha_i). Default is idvar=1. |
yearvar |
Variance of year effects (mu_t). Default is yearvar=1. |
shockvar |
Variance of idiosyncratic shocks (epsilon_it). Default is shockvar=1. |
indivAR1 |
Each individual's shocks follow an AR(1) process. Default is FALSE. |
time_covars |
Add 2 time-varying covariates, called "X1" and "X2". Default is FALSE. |
clusters |
Add 10 randomly assigned clusters, with cluster-specific AR(1) shocks. Default is FALSE. |
markets |
Add 10 randomly assigned markets, with market-specific shocks that are systematically greater for markets that are treated earlier. Default is FALSE. |
randomNA |
If TRUE, randomly set the outcome variable to missing (NA) for some observations. Default is FALSE. |
missingCohorts |
If set to a particular cohort (or vector of cohorts), all of the outcomes for that cohort at event time -1 will be set to missing. Default is NULL. |
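To make the roles of idvar, yearvar, and shockvar concrete, the following base-R sketch draws the three random components of the model with those variances. It is a simplified illustration, not SimDiD's actual implementation: it assumes normal draws, a constant ATT of 1, cohorts assigned uniformly at random, and no never-treated group. All object names are hypothetical.

```r
set.seed(1)
n <- 100
years <- 2003:2013
cohorts <- c(2007, 2010, 2012)
idvar <- 1; yearvar <- 1; shockvar <- 1

# One row per individual-year
panel <- expand.grid(id = seq_len(n), year = years)

alpha <- rnorm(n, sd = sqrt(idvar))              # individual effects alpha_i
mu <- rnorm(length(years), sd = sqrt(yearvar))   # year effects mu_t
G <- sample(cohorts, n, replace = TRUE)          # cohort G_i (assumed uniform)

panel$cohort <- G[panel$id]
panel$Y <- alpha[panel$id] +
  mu[match(panel$year, years)] +
  1 * (panel$year >= panel$cohort) +             # constant ATT = 1 (assumption)
  rnorm(nrow(panel), sd = sqrt(shockvar))        # idiosyncratic shocks epsilon_it

head(panel)
```

Because the arguments are variances, the sketch passes their square roots to rnorm, which takes a standard deviation.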
Value
A list with two data.tables. The first data.table is the simulated data, with variables (id, year, cohort, Y), where Y is the outcome variable. The second data.table contains the true ATT values, both at the (event, cohort) level and by event time, averaging across cohorts.
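Since the return value is described only as a list of two data.tables, a cautious way to work with it is by position. The sketch below assumes the DiDforBigData package is installed (the block is skipped otherwise) and makes no assumption about the names of the list elements.

```r
if (requireNamespace("DiDforBigData", quietly = TRUE)) {
  sim <- DiDforBigData::SimDiD(sample_size = 200)

  simdata  <- sim[[1]]  # simulated panel with id, year, cohort, Y
  true_att <- sim[[2]]  # true ATT values

  str(simdata)
  print(names(sim))     # inspect the element names rather than guessing them
}
```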
Examples
# simulate data with default options
SimDiD()
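A further example with non-default options may help when stress-testing an estimator. It uses only arguments listed under Arguments; the particular values are arbitrary choices for illustration, and the call is skipped if DiDforBigData is not installed.

```r
if (requireNamespace("DiDforBigData", quietly = TRUE)) {
  # two cohorts, one year of anticipation, and some missing outcomes
  sim_na <- DiDforBigData::SimDiD(
    seed = 2,
    sample_size = 500,
    cohorts = c(2007, 2010),
    anticipation = 1,
    randomNA = TRUE,
    missingCohorts = 2010
  )
  # count missing outcomes in the simulated data (first list element)
  print(sum(is.na(sim_na[[1]]$Y)))
}
```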