R: Pre-Test of Conditional Parallel Trends Assumption

conditional_did_pretest {did}

R Documentation

Pre-Test of Conditional Parallel Trends Assumption

Description

An integrated moments test of whether the conditional parallel trends assumption holds in all pre-treatment time periods for all groups.

Usage

conditional_did_pretest(
  yname,
  tname,
  idname = NULL,
  gname,
  xformla = NULL,
  data,
  panel = TRUE,
  allow_unbalanced_panel = FALSE,
  control_group = c("nevertreated", "notyettreated"),
  weightsname = NULL,
  alp = 0.05,
  bstrap = TRUE,
  cband = TRUE,
  biters = 1000,
  clustervars = NULL,
  est_method = "ipw",
  print_details = FALSE,
  pl = FALSE,
  cores = 1
)

Arguments

`yname`	The name of the outcome variable
`tname`	The name of the column containing the time periods
`idname`	The individual (cross-sectional unit) id name
`gname`	The name of the variable in `data` that contains the first period when a particular observation is treated. This should be a positive number for all observations in treated groups. It defines which "group" a unit belongs to. It should be 0 for units in the untreated group.
`xformla`	A formula for the covariates to include in the model. It should be of the form `~ X1 + X2`. Default is NULL which is equivalent to `xformla=~1`. This is used to create a matrix of covariates which is then passed to the 2x2 DID estimator chosen in `est_method`.
`data`	The name of the data.frame that contains the data
`panel`	Whether or not the data is a panel dataset. The panel dataset should be provided in long format – that is, where each row corresponds to a unit observed at a particular point in time. The default is TRUE. When using a panel dataset, the variable `idname` must be set. When `panel=FALSE`, the data is treated as repeated cross sections.
`allow_unbalanced_panel`	Whether or not the function should "balance" the panel with respect to time and id. The default value is `FALSE`, which means that `att_gt()` will drop all units where data is not observed in all periods. The advantage of this is that the computations are faster (sometimes substantially).
`control_group`	Which units to use as the control group. The default is "nevertreated", which sets the control group to be the group of units that never participate in the treatment. This group does not change across groups or time periods. The other option is to set `control_group="notyettreated"`. In this case, the control group is set to the group of units that have not yet participated in the treatment in that time period. This includes all never treated units, plus units that eventually participate in the treatment but have not participated yet.
`weightsname`	The name of the column containing the sampling weights. If not set, all observations have the same weight.
`alp`	The significance level. The default is 0.05.
`bstrap`	Boolean for whether or not to compute standard errors using the multiplier bootstrap. If standard errors are clustered, then one must set `bstrap=TRUE`. Default is `TRUE` (in addition, `cband` is also `TRUE` by default, indicating that uniform confidence bands will be returned). If `bstrap` is `FALSE`, then analytical standard errors are reported.
`cband`	Boolean for whether or not to compute a uniform confidence band that covers all of the group-time average treatment effects with fixed probability `1-alp`. In order to compute uniform confidence bands, `bstrap` must also be set to `TRUE`. The default is `TRUE`.
`biters`	The number of bootstrap iterations to use. The default is 1000, and this is only applicable if `bstrap=TRUE`.
`clustervars`	A vector of variable names to cluster on. At most, there can be two variables (otherwise an error is thrown), and one of these must be the same as `idname`, which allows for clustering at the individual level. By default, clustering is at the individual level (when `bstrap=TRUE`).
`est_method`	The method to compute group-time average treatment effects. The default shown in Usage is "ipw", for inverse probability weighting. Other built-in methods include "dr", which uses the doubly robust approach in the `DRDID` package, and "reg" for first step regression estimators. The user can also pass their own function for estimating group-time average treatment effects. This should be a function `f(Y1,Y0,treat,covariates)` where `Y1` is an `n` x `1` vector of outcomes in the post-treatment period, `Y0` is an `n` x `1` vector of pre-treatment outcomes, `treat` is a vector indicating whether or not an individual participates in the treatment, and `covariates` is an `n` x `k` matrix of covariates. The function should return a list that includes `ATT` (an estimated average treatment effect) and `inf.func` (an `n` x `1` influence function). The function can return other things as well, but these are the only two that are required. `est_method` is only used if covariates are included.
`print_details`	Whether or not to show details/progress of computations. Default is `FALSE`.
`pl`	Whether or not to use parallel processing. Default is `FALSE`.
`cores`	The number of cores to use for parallel processing. Default is 1.
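
The data requirements described for `gname` and `allow_unbalanced_panel` can be illustrated with a small base R sketch. It is not taken from the package: the data frame and the column names `id`, `period`, `treated`, and `first_treat` are hypothetical. It keeps only units observed in every period and then derives a first-treated-period column that is 0 for never-treated units.

```r
# Hypothetical long-format panel: one row per unit and period.
# Column names are illustrative, not required by the package.
panel_df <- data.frame(
  id      = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4),
  period  = c(2003, 2004, 2005, 2003, 2004, 2005,
              2003, 2004, 2005, 2003, 2004),
  treated = c(0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1)
)

# Keep only units observed in every period (a balanced panel).
n_periods    <- length(unique(panel_df$period))
obs_per_id   <- table(panel_df$id)
complete_ids <- names(obs_per_id)[obs_per_id == n_periods]
balanced     <- panel_df[as.character(panel_df$id) %in% complete_ids, ]

# First treated period per unit; 0 marks never-treated units.
first_treated <- sapply(split(balanced, balanced$id), function(d) {
  if (any(d$treated == 1)) min(d$period[d$treated == 1]) else 0
})
balanced$first_treat <- unname(first_treated[as.character(balanced$id)])
```

In this toy data, unit 4 is dropped because it is missing one period, and the resulting `first_treat` column could then be supplied as `gname="first_treat"`.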

Value

an MP.TEST object

References

Callaway, Brantly and Sant'Anna, Pedro H. C. "Difference-in-Differences with Multiple Time Periods and an Application on the Minimum Wage and Employment." Working Paper https://arxiv.org/abs/1803.09015v2 (2018).

Examples

## Not run: 
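library(did)

# mpdta: county teen employment data included with the did package
data(mpdta)

# Pre-test conditional parallel trends, conditioning on log population
pre.test <- conditional_did_pretest(yname="lemp",
                                    tname="year",
                                    idname="countyreal",
                                    gname="first.treat",
                                    xformla=~lpop,
                                    data=mpdta)
summary(pre.test)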

## End(Not run)
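
The `est_method` argument is described as accepting a user-written function `f(Y1,Y0,treat,covariates)` that returns `ATT` and `inf.func`. The following is a minimal sketch of such a function, assuming only that contract. The name `simple_did` is hypothetical, the estimator ignores the covariates (so it is an unconditional 2x2 DID rather than one of the package's built-in methods), and the `...` argument is an assumption added to absorb any extra arguments a caller might pass.

```r
# Hypothetical custom estimator following the f(Y1, Y0, treat, covariates)
# contract described under `est_method`. Covariates are ignored here.
simple_did <- function(Y1, Y0, treat, covariates, ...) {
  dY <- Y1 - Y0                  # outcome change for each unit
  p  <- mean(treat)              # share of treated units
  m1 <- mean(dY[treat == 1])     # mean change among treated units
  m0 <- mean(dY[treat == 0])     # mean change among comparison units
  att <- m1 - m0                 # difference-in-differences estimate

  # Influence function of a difference in group means (one value per unit)
  inf_func <- treat / p * (dY - m1) - (1 - treat) / (1 - p) * (dY - m0)

  list(ATT = att, inf.func = inf_func)
}

# Toy check on simulated data with a true effect of 2
set.seed(1)
n     <- 500
treat <- rbinom(n, 1, 0.4)
Y0    <- rnorm(n)
Y1    <- Y0 + 1 + 2 * treat + rnorm(n)

res <- simple_did(Y1, Y0, treat, covariates = matrix(1, n, 1))
res$ATT                              # estimate of the effect
se <- sqrt(mean(res$inf.func^2) / n) # plug-in standard error
```

Under the contract stated in Arguments, a function like this would be supplied as `est_method=simple_did`. Whether the package passes additional arguments (such as weights) to user functions is not stated here, so treat the signature as illustrative.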

[Package did version 2.1.2 Index]