RDHonest {RDHonest} | R Documentation |
Honest inference in RD
Description
Calculate estimators and bias-aware CIs for the sharp or fuzzy RD parameter, or for the value of the conditional mean at a point.
Usage
RDHonest(
formula,
data,
subset,
weights,
cutoff = 0,
M,
kern = "triangular",
na.action,
opt.criterion = "MSE",
h,
se.method = "nn",
alpha = 0.05,
beta = 0.8,
J = 3,
sclass = "H",
T0 = 0,
point.inference = FALSE,
sigmaY2,
sigmaD2,
sigmaYD,
clusterid
)
Arguments
formula |
an object of class |
data |
optional data frame, list or environment (or object coercible by
|
subset |
optional vector specifying a subset of observations to be used in the fitting process. |
weights |
Optional vector of weights to weight the observations (useful for aggregated data). The weights are interpreted as the number of observations that each aggregated data point averages over. Disregarded if optimal kernel is used. |
cutoff |
specifies the RD cutoff in the running variable. For inference
at a point, specifies the point |
M |
Bound on second derivative of the conditional mean function, a
numeric vector of length one. For fuzzy RD, |
kern |
specifies the kernel function used in the local regression. It
can either be a string equal to |
na.action |
function which indicates what should happen when the data
contain |
opt.criterion |
Optimality criterion that the bandwidth is designed to optimize. The options are:
The methods use conditional variance given by |
h |
bandwidth, a scalar parameter. If not supplied, optimal bandwidth is
computed according to criterion given by |
se.method |
method for estimating standard error of the estimate, one of:
|
alpha |
determines confidence level, |
beta |
Determines quantile of excess length to optimize, if bandwidth
optimizes given quantile of excess length of one-sided confidence
intervals ( |
J |
Number of nearest neighbors, if |
sclass |
Smoothness class, either |
T0 |
Initial estimate of the treatment effect for calculating the optimal bandwidth. Only relevant for fuzzy RD. |
point.inference |
Do inference at a point determined by |
sigmaY2 |
Supply variance of outcome. Ignored when kernel is optimal. |
sigmaD2 |
Supply variance of treatment (fuzzy RD only). |
sigmaYD |
Supply covariance of treatment and outcome (fuzzy RD only). |
clusterid |
Vector specifying cluster membership. If supplied,
|
Details
The bandwidth is calculated to be optimal for a given performance criterion, as specified by opt.criterion. Alternatively, for local polynomial estimators, the bandwidth can be specified by h. For kern="optimal", calculate optimal estimators under second-order Taylor smoothness class (sharp RD only).
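The two ways of setting the bandwidth described above can be compared with a short sketch. It assumes the RDHonest package is installed and reuses the lee08 data and the M = 0.1 bound from the Examples section; the object names fit_fixed and fit_opt are illustrative only.

```r
library(RDHonest)  # provides RDHonest() and the lee08 data used in the Examples

# Bandwidth supplied directly through h
fit_fixed <- RDHonest(voteshare ~ margin, data = lee08,
                      kern = "triangular", M = 0.1, h = 10)

# h omitted: the bandwidth is chosen to optimize the criterion
# named in opt.criterion
fit_opt <- RDHonest(voteshare ~ margin, data = lee08,
                    kern = "triangular", M = 0.1, opt.criterion = "MSE")

# The bandwidth actually used is reported in the coefficients data frame
fit_fixed$coefficients$bandwidth
fit_opt$coefficients$bandwidth
```

Comparing the two bandwidth values shows how far a hand-picked h is from the one the chosen criterion would select for the same M.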
Value
Returns an object of class "RDResults". The function print can be used to obtain and print a summary of the results. An object of class "RDResults" is a list containing four components.
First, a data frame "coefficients" containing the following columns:
term
type of parameter being estimated
estimate
point estimate
std.error
standard error of estimate
maximum.bias
maximum bias of estimate
conf.low, conf.high
lower (upper) end-point of a two-sided CI based on estimate
conf.low.onesided, conf.high.onesided
lower (upper) end-point of one-sided CIs based on estimate
bandwidth
bandwidth used. If kern="optimal", the smoothing parameters bandwidth.m and bandwidth.p on either side of the cutoff are reported instead
eff.obs
number of effective observations
leverage
maximal leverage of estimate
cv
critical value used to compute two-sided CIs
alpha
coverage level, as specified by option alpha
method
sclass is used
M
curvature bound used for worst-case bias calculations. For fuzzy RD, equals (abs(estimate)*M.fs+M.rf)/first.stage
M.rf, M.fs
curvature bound for the outcome (i.e. reduced-form) and first-stage regressions. Fuzzy RD only.
first.stage
estimate of the first-stage coefficient. Fuzzy RD only.
kernel
kernel used
p.value
p-value for testing the null of no effect
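How the columns above relate to one another can be sketched in base R. The sketch assumes the bias-aware construction from the Armstrong and Kolesár papers listed under References: the two-sided critical value is the 1 - alpha quantile of |N(t, 1)| with t = maximum.bias / std.error, and one-sided limits subtract or add the maximum bias plus a normal quantile times the standard error. The helper names bias_aware_cv and fuzzy_M and all numeric values are hypothetical; they are not part of the package.

```r
# Critical value: the 1 - alpha quantile of |N(t, 1)|. Since |N(t, 1)|^2 is
# noncentral chi-squared with 1 degree of freedom and noncentrality t^2,
# the quantile is the square root of the corresponding chi-squared quantile.
bias_aware_cv <- function(t, alpha = 0.05) {
  sqrt(stats::qchisq(1 - alpha, df = 1, ncp = t^2))
}

# Hypothetical values standing in for one row of the coefficients data frame
estimate     <- 5.0
std.error    <- 1.2
maximum.bias <- 0.6
alpha        <- 0.05

# Two-sided CI: estimate +/- cv * std.error
cv <- bias_aware_cv(maximum.bias / std.error, alpha)
two_sided <- estimate + c(-1, 1) * cv * std.error

# One-sided limits: shift by the maximum bias, then by a normal quantile
one_sided_low  <- estimate - maximum.bias - stats::qnorm(1 - alpha) * std.error
one_sided_high <- estimate + maximum.bias + stats::qnorm(1 - alpha) * std.error

# Fuzzy RD curvature bound, transcribed from the description of column M
fuzzy_M <- function(estimate, M.fs, M.rf, first.stage) {
  (abs(estimate) * M.fs + M.rf) / first.stage
}
```

With maximum.bias equal to zero the critical value reduces to the usual normal quantile, so the interval widens only to the extent that the worst-case bias is large relative to the standard error.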
Second, a list called "data" containing the data used for estimation. This is useful mostly for internal calculations. Third, an object of class "lm" containing the local linear regression estimates. Finally, a call object containing the matched call called "call".
If kern="optimal", the "lm" object is empty, and the numeric vectors "delta" and "omega" are returned in addition. These correspond to the parameters in the modulus problem used to compute the optimal estimation weights.
Note
subset is evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
References
Timothy B. Armstrong and Michal Kolesár. Optimal inference in a class of regression models. Econometrica, 86(2):655–683, March 2018. doi:10.3982/ECTA14434
Timothy B. Armstrong and Michal Kolesár. Simple and honest confidence intervals in nonparametric regression. Quantitative Economics, 11(1):1–39, January 2020.
Michal Kolesár and Christoph Rothe. Inference in regression discontinuity designs with a discrete running variable. American Economic Review, 108(8):2277–2304, August 2018. doi:10.1257/aer.20160945
Examples
RDHonest(voteshare ~ margin, data = lee08, kern = "uniform", M = 0.1, h = 10)
RDHonest(cn | retired ~ elig_year, data=rcp, cutoff=0, M=c(4, 0.4),
kern="triangular", opt.criterion="MSE", T0=0, h=3)
RDHonest(voteshare ~ margin, data = lee08, subset = margin>0,
kern = "uniform", M = 0.1, h = 10, point.inference=TRUE)