scdataMulti {scpi} | R Documentation |
Data Preparation for scest or scpi for Point Estimation and Inference Procedures Using Synthetic Control Methods.
Description
The command prepares the data to be used by scest or scpi to implement estimation and inference procedures for Synthetic Control (SC) methods in the general case of multiple treated units and staggered adoption. It is a generalization of scdata, which prepares the data in the particular case of a single treated unit.
The names of the output matrices follow the terminology proposed in Cattaneo, Feng, Palomba and Titiunik (2022).
Companion Stata and Python packages are described in Cattaneo, Feng, Palomba, and Titiunik (2022).
Companion commands are: scdata for data preparation in the single treated unit case, scest for point estimation, scpi for inference procedures, scplot and scplotMulti for plots in the single and multiple treated unit(s) cases, respectively.
Related Stata, R, and Python packages useful for inference in SC designs are described on the following website:
https://nppackages.github.io/scpi/
For an introduction to synthetic control methods, see Abadie (2021) and references therein.
Usage
scdataMulti(
df,
id.var,
time.var,
outcome.var,
treatment.var,
features = NULL,
cov.adj = NULL,
cointegrated.data = FALSE,
post.est = NULL,
units.est = NULL,
donors.est = NULL,
anticipation = 0,
effect = "unit-time",
constant = FALSE,
verbose = TRUE,
sparse.matrices = FALSE
)
Arguments
df |
a dataframe object. |
id.var |
a character with the name of the variable containing units' IDs. The ID variable can be numeric or character. |
time.var |
a character with the name of the time variable. The time variable has to be numeric, integer, or Date. In
case |
outcome.var |
a character with the name of the outcome variable. The outcome variable has to be numeric. |
treatment.var |
a character with the name of the variable containing the treatment assignment of each unit. The referenced variable has to take value 1 if the unit is treated in that period and value 0 otherwise. Note that, as is common in the SC literature, we presume that once a unit is treated it remains treated forever. If treatment.var does not comply with this requirement, the command will not work as expected! |
features |
a list containing the names of the feature variables used for estimation.
If this option is not specified the default is |
cov.adj |
a list specifying the names of the covariates to be used for adjustment for each feature. If |
cointegrated.data |
a logical that indicates whether the data are believed to be cointegrated. The default value is FALSE. |
post.est |
a scalar specifying the number of post-treatment periods or a list specifying the periods for which treatment effects have to be computed for each treated unit. |
units.est |
a list specifying the treated units for which treatment effects have to be computed. |
donors.est |
a list specifying the donor units to be used. If the list has length 1, then all treated units share the same potential donors. Otherwise, if the user requires different donor pools for different treated units, the list must have the same length as the number of treated units, and each element has to be named with one treated unit's name as specified in id.var. |
anticipation |
a scalar that indicates the number of periods of potential anticipation effects. Default is 0. |
effect |
a string indicating the type of treatment effect to be computed. Options are: 'unit-time', which estimates treatment effects for each combination of treated unit and post-treatment period; 'unit', which estimates the treatment effect for each unit by averaging post-treatment features over time; 'time', which estimates the average treatment effect on the treated at various horizons. |
constant |
a logical which controls the inclusion of a constant term across features. The default value is FALSE. |
verbose |
if |
sparse.matrices |
if |
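The requirement on treatment.var (0 before adoption, 1 from adoption onward, never reverting to 0) can be checked before calling scdataMulti. The sketch below uses base R only. The toy panel and the helper is_absorbing are hypothetical and are not part of scpi.

```r
# Toy panel: three units observed over five periods (hypothetical data).
panel <- data.frame(
  unit  = rep(c("A", "B", "C"), each = 5),
  time  = rep(2001:2005, times = 3),
  treat = c(0, 0, 1, 1, 1,   # A: treated from 2003 onward
            0, 0, 0, 0, 0,   # B: never treated
            0, 1, 0, 1, 1)   # C: treatment switches off in 2003 (invalid)
)

# Hypothetical helper: TRUE if the 0/1 indicator never falls from 1 back
# to 0 once the observations of a unit are sorted by time.
is_absorbing <- function(treat, time) {
  all(diff(treat[order(time)]) >= 0)
}

# Apply the check unit by unit; the result is a named logical vector.
ok <- sapply(split(panel, panel$unit),
             function(d) is_absorbing(d$treat, d$time))
print(ok)

# Units that violate the requirement and should be fixed or dropped.
names(ok)[!ok]
```

Here units A and B pass the check, while unit C is flagged because its indicator returns to 0 after having been 1.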
Details
Covariate-adjustment. See the Details section in scdata for further information on how to specify covariate-adjustment feature-by-feature.
Cointegration. cointegrated.data allows the user to model the belief that \mathbf{A} and \mathbf{B} form a cointegrated system. In practice, this implies that when dealing with the pseudo-true residuals \mathbf{u}, the first differences of \mathbf{B} are used rather than the levels.
Effect. effect allows the user to select the causal quantity of interest. The default option, effect = "unit-time", prepares the data for estimation of
\tau_{ik}, \quad i = 1, \ldots, N_1,
where k indexes post-treatment periods. The option effect = "unit" prepares the data for estimation of the average effect on each treated unit across multiple post-treatment periods, whereas the option effect = "time" prepares the data for estimation of
\tau_{\cdot k} = \frac{1}{N_1} \sum_{i=1}^{N_1} \tau_{ik},
which is the average effect across the N_1 treated units at post-treatment period k.
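As a minimal sketch of the averaging behind \tau_{\cdot k}, the base R snippet below starts from a made-up matrix of unit-time effects \tau_{ik}. The values, the object names (tau, tau_dot_k, tau_i_dot), and the two-period horizon are illustrative assumptions, not output of scpi. Column means give the average across the N_1 treated units at each post-treatment period, while row means give the per-unit average over time that the Arguments section describes for effect = "unit".

```r
# Hypothetical unit-time effects: rows are N_1 = 3 treated units,
# columns are two post-treatment periods.
tau <- matrix(c(1.0, 2.0,
                3.0, 4.0,
                5.0, 6.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("unit1", "unit2", "unit3"),
                              c("k1", "k2")))

# tau_{.k} = (1 / N_1) * sum_i tau_{ik}: average over treated units
# for each post-treatment period.
tau_dot_k <- colMeans(tau)

# Average over post-treatment periods for each treated unit.
tau_i_dot <- rowMeans(tau)

print(tau_dot_k)
print(tau_i_dot)
```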
Value
The command returns an object of class 'scdataMulti' containing the following components:
A |
a matrix containing pre-treatment features of the treated units. |
B |
a matrix containing pre-treatment features of the control units. |
C |
a matrix containing covariates for adjustment. |
P |
a matrix whose rows are the vectors used to predict the out-of-sample series for the synthetic units. |
P.diff |
for internal use only. |
Y.df |
a dataframe containing the outcome variable for all units. |
Y.pre |
a matrix containing the pre-treatment outcome of the treated units. |
Y.post |
a matrix containing the post-treatment outcome of the treated units. |
Y.donors |
a matrix containing the pre-treatment outcome of the control units. |
specs |
a list containing some specifics of the data:
|
Author(s)
Matias Cattaneo, Princeton University. cattaneo@princeton.edu.
Yingjie Feng, Tsinghua University. fengyj@sem.tsinghua.edu.cn.
Filippo Palomba, Princeton University (maintainer). fpalomba@princeton.edu.
Rocio Titiunik, Princeton University. titiunik@princeton.edu.
References
Abadie, A. (2021). Using synthetic controls: Feasibility, data requirements, and methodological aspects. Journal of Economic Literature, 59(2), 391-425.
Cattaneo, M. D., Feng, Y., and Titiunik, R. (2021). Prediction intervals for synthetic control methods. Journal of the American Statistical Association, 116(536), 1865-1880.
Cattaneo, M. D., Feng, Y., Palomba, F., and Titiunik, R. (2022). scpi: Uncertainty Quantification for Synthetic Control Methods, arXiv:2202.05984.
Cattaneo, M. D., Feng, Y., Palomba, F., and Titiunik, R. (2022). Uncertainty Quantification in Synthetic Controls with Staggered Treatment Adoption, arXiv:2210.05026.
See Also
scdata, scest, scpi, scplot, scplotMulti
Examples
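library(scpi)

# Load the German reunification data shipped with scpi
datager <- scpi_germany

# Build the 0/1 treatment indicator: West Germany is treated from 1991 onward
datager$tr_id <- 0
datager$tr_id[(datager$country == "West Germany" & datager$year > 1990)] <- 1
# Italy is left untreated here (this assignment keeps tr_id at 0);
# assign 1 instead to treat Italy from 1993 onward as a second treated unit
datager$tr_id[(datager$country == "Italy" & datager$year > 1992)] <- 0

outcome.var <- "gdp"
id.var <- "country"
treatment.var <- "tr_id"
time.var <- "year"

# Prepare the data, using gdp and trade as features
df.unit <- scdataMulti(datager, id.var = id.var, outcome.var = outcome.var,
                       treatment.var = treatment.var,
                       time.var = time.var, features = list(c("gdp", "trade")),
                       cointegrated.data = TRUE, constant = TRUE)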