didcontDMLpanel {causalweight} | R Documentation |
Continuous Difference-in-Differences using Double Machine Learning for Panel Data
Description
This function estimates the average treatment effect on the treated of a continuously distributed treatment in panel data based on a Difference-in-Differences (DiD) approach using double machine learning to control for time-varying confounders in a data-driven manner. It supports estimation under various machine learning methods and uses k-fold cross-fitting.
Usage
didcontDMLpanel(
ydiff,
d,
t,
dtreat,
dcontrol,
t1 = 1,
controls,
MLmethod = "lasso",
psmethod = 1,
trim = 0.1,
lognorm = FALSE,
bw = NULL,
bwfactor = 0.7,
cluster = NULL,
k = 3
)
Arguments
ydiff |
Outcome difference between a post-treatment and a pre-treatment period. Should not contain missing values. |
d |
Treatment variable in the treatment period of interest. Should be continuous and not contain missing values. |
t |
Time variable indicating outcome periods. Should not contain missing values. |
dtreat |
Value of the treatment under treatment (in the treatment period of interest). This value would be 1 for binary treatments. |
dcontrol |
Value of the treatment under control (in the treatment period of interest). This value would be 0 for binary treatments. |
t1 |
Value indicating the post-treatment outcome period in which the effect is evaluated, which is the later of the two periods used to generate the outcome difference in ydiff. Default is 1. |
controls |
Covariates and/or previous treatment history to be controlled for. Should not contain missing values. |
MLmethod |
Machine learning method for estimating nuisance parameters using the |
psmethod |
Method for computing generalized propensity scores. Set to 1 for estimating conditional treatment densities using the treatment as dependent variable, or 2 for using the treatment kernel weights as dependent variable. Default is 1. |
trim |
Trimming threshold (in percentage) for discarding observations with too much influence within any subgroup defined by the treatment group and time. Default is 0.1. |
lognorm |
Logical indicating if log-normal transformation should be applied when estimating conditional treatment densities using the treatment as dependent variable. Default is FALSE. |
bw |
Bandwidth for kernel density estimation. Default is NULL, implying that the bandwidth is calculated based on the rule-of-thumb. |
bwfactor |
Factor by which the bandwidth is multiplied. Default is 0.7 (undersmoothing). |
cluster |
Optional clustering variable for calculating standard errors. |
k |
Number of folds in k-fold cross-fitting. Default is 3. |
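To illustrate how bw and bwfactor interact, the following is a minimal base R sketch, not the package's internal code. It assumes Silverman's rule of thumb (stats::bw.nrd0) and a Gaussian kernel; the rule and kernel actually used by didcontDMLpanel may differ. The helper kernel_weight and the simulated treatment are hypothetical and serve only as illustration.

```r
set.seed(1)
n <- 1000
d <- rnorm(n) + runif(n, 0, 2)   # simulated continuous treatment (illustrative only)

bwfactor <- 0.7
h_rot <- stats::bw.nrd0(d)       # rule-of-thumb bandwidth (assumed: Silverman's rule)
h <- bwfactor * h_rot            # bwfactor < 1 shrinks the bandwidth (undersmoothing)

# Hypothetical helper: Gaussian kernel weight of each observation around a treatment value d0
kernel_weight <- function(d, d0, h) stats::dnorm((d - d0) / h) / h

w_treat <- kernel_weight(d, d0 = 1, h = h)     # weights around dtreat = 1
w_control <- kernel_weight(d, d0 = 0, h = h)   # weights around dcontrol = 0
```

Observations whose treatment value lies close to dtreat (or dcontrol) receive large weights, while observations far away contribute almost nothing. A smaller bandwidth concentrates the weights more tightly around the chosen treatment value.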
Details
This function estimates the Average Treatment Effect on the Treated (ATET) by Difference-in-Differences in panel data while controlling for confounders using double machine learning. The function supports different machine learning methods for estimating nuisance parameters and performs k-fold cross-fitting, so that the nuisance predictions for each observation come from models trained on the other folds. The function also handles binary and continuous outcomes, and provides options for trimming and bandwidth adjustments in kernel density estimation.
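The k-fold cross-fitting idea can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the package's implementation: a linear model fitted with lm stands in for the machine learner, and the simulated data and variable names are hypothetical.

```r
set.seed(1)
n <- 300
k <- 3
x <- rnorm(n)                      # simulated covariate (illustrative only)
ydiff <- 1 + 0.5 * x + rnorm(n)    # simulated outcome difference

# Randomly assign each observation to one of k equally sized folds
fold <- sample(rep(seq_len(k), length.out = n))

# Out-of-fold predictions of the conditional mean of ydiff given x
pred <- rep(NA_real_, n)
for (j in seq_len(k)) {
  train <- fold != j
  # lm is a stand-in for the machine learner selected via MLmethod
  fit <- lm(ydiff ~ x, data = data.frame(ydiff = ydiff[train], x = x[train]))
  pred[!train] <- predict(fit, newdata = data.frame(x = x[!train]))
}
```

Each prediction in pred comes from a model that never saw the corresponding observation, which is what guards against overfitting bias when flexible learners are used.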
Value
A list with the following components:
ATET
: Estimate of the Average Treatment Effect on the Treated.
se
: Standard error of the ATET estimate.
trimmed
: Number of discarded (trimmed) observations.
pval
: P-value.
pscores
: Propensity scores (2 columns): under treatment, under control.
outcomepred
: Conditional outcome predictions.
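As a usage sketch, the ATET and se components can be combined into a normal-approximation confidence interval. The list below is a hypothetical stand-in with placeholder numbers, not output of didcontDMLpanel. That pval corresponds to a two-sided test based on the normal distribution is an assumption made here for illustration.

```r
# Hypothetical placeholder values, NOT actual output of didcontDMLpanel
results <- list(ATET = 2.0, se = 0.1)

# 95% confidence interval based on the normal approximation
z <- stats::qnorm(0.975)
ci <- c(lower = results$ATET - z * results$se,
        upper = results$ATET + z * results$se)

# Two-sided p-value under the same normal approximation (assumed form)
pval_normal <- 2 * stats::pnorm(-abs(results$ATET / results$se))
```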
References
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., Robins, J. (2018): "Double/debiased machine learning for treatment and structural parameters", The Econometrics Journal, 21, C1-C68.
Haddad, M., Huber, M., Medina-Reyes, J., Zhang, L. (2024): "Difference-in-Differences under time-varying continuous treatments based on double machine learning".
Examples
## Not run:
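# Example with simulated data
library(causalweight)
set.seed(1)                      # for reproducibility
n <- 1000
x <- 0.5 * rnorm(n)              # observed covariate
u <- runif(n, 0, 2)              # unobserved, time-constant confounder
d <- x + u + rnorm(n)            # continuous treatment
y0 <- u + rnorm(n)               # pre-treatment outcome
y1 <- 2 * d + x + u + rnorm(n)   # post-treatment outcome
t <- rep(1, n)                   # all outcome differences refer to period t1 = 1
# The true effect of dtreat = 1 versus dcontrol = 0 is 2
results <- didcontDMLpanel(ydiff = y1 - y0, d = d, t = t, dtreat = 1, dcontrol = 0,
                           controls = x, MLmethod = "lasso")
cat("ATET: ", round(results$ATET, 3), ", Standard error: ", round(results$se, 3), "\n")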
## End(Not run)