lpbwselect {nprobust}    R Documentation

Bandwidth Selection Procedures for Local Polynomial Regression Estimation and Inference

Description

lpbwselect implements bandwidth selectors for local polynomial regression point estimators and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2020) for related optimality results. It also implements other bandwidth selectors available in the literature. See Wand and Jones (1995) and Fan and Gijbels (1996) for background references.

Companion commands: lprobust for local polynomial point estimation and inference procedures.

A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.

Usage

lpbwselect(y, x, eval = NULL, neval = NULL, p = NULL, deriv = NULL,
kernel = "epa", bwselect = "mse-dpi", bwcheck = 21, bwregul = 1, 
imsegrid = 30, vce = "nn", cluster = NULL,
nnmatch = 3, interior = FALSE, subset = NULL)

Arguments

y

dependent variable.

x

independent variable.

eval

vector of evaluation point(s). By default it uses 30 equally spaced points over the support of x.

neval

number of quantile-spaced evaluation points on the support of x. Default is neval=30.

p

polynomial order used to construct the point estimator. Default is p = 1 (local linear regression).

deriv

derivative order of the regression function to be estimated. Default is deriv=0 (regression function).

kernel

kernel function used to construct local polynomial estimators. Options are epa for the Epanechnikov kernel, tri for the triangular kernel, uni for the uniform kernel and gau for the Gaussian kernel. Default is kernel = epa.

bwselect

bandwidth selection procedure to be used. Options are:

mse-dpi second-generation DPI implementation of MSE-optimal bandwidth. Default option.

mse-rot ROT implementation of MSE-optimal bandwidth.

imse-dpi second-generation DPI implementation of IMSE-optimal bandwidth (computed using grid of evaluation points selected).

imse-rot ROT implementation of IMSE-optimal bandwidth (computed using grid of evaluation points selected).

ce-dpi second-generation DPI implementation of CE-optimal bandwidth.

ce-rot ROT implementation of CE-optimal bandwidth.

all reports all available bandwidth selection procedures.

Note: MSE = Mean Squared Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019).

bwcheck

if a positive integer is provided, then the selected bandwidth is enlarged so that at least bwcheck effective observations are available at each evaluation point. Default is bwcheck = 21.

bwregul

specifies scaling factor for the regularization term added to the denominator of bandwidth selectors. Setting bwregul = 0 removes the regularization term from the bandwidth selectors. Default is bwregul = 1.

imsegrid

number of evaluation points used to compute the IMSE bandwidth selector. Default is imsegrid = 30.

vce

procedure used to compute the variance-covariance matrix estimator. Options are:

nn heteroskedasticity-robust nearest neighbor variance estimator with nnmatch the (minimum) number of neighbors to be used. Default choice.

hc0 heteroskedasticity-robust plug-in residuals variance estimator without weights.

hc1 heteroskedasticity-robust plug-in residuals variance estimator with hc1 weights.

hc2 heteroskedasticity-robust plug-in residuals variance estimator with hc2 weights.

hc3 heteroskedasticity-robust plug-in residuals variance estimator with hc3 weights.

cluster

indicates the cluster ID variable used for cluster-robust variance estimation with degrees-of-freedom weights. By default it is combined with vce=nn for cluster-robust nearest neighbor variance estimation. Another option is plug-in residuals combined with vce=hc1.

nnmatch

to be combined with vce=nn for the heteroskedasticity-robust nearest neighbor variance estimator, with nnmatch indicating the minimum number of neighbors to be used. Default is nnmatch=3.

interior

if TRUE, all evaluation points are assumed to be interior points. This option affects only data-driven bandwidth selection via lpbwselect. Default is interior = FALSE.

subset

optional rule specifying a subset of observations to be used.
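The bwcheck rule described above can be illustrated in base R. This is a minimal sketch under stated assumptions, not the code used inside nprobust: the helper enlarge_bandwidth is hypothetical, and it treats an observation as "effective" at evaluation point e when |x - e| <= h.

```r
# Hypothetical helper (not part of nprobust): enlarge h, if needed, so that
# at least `bwcheck` observations lie within distance h of the point e.
enlarge_bandwidth <- function(x, e, h, bwcheck = 21) {
  stopifnot(length(x) >= bwcheck, h > 0)
  # distance from e to its bwcheck-th nearest observation
  h_min <- sort(abs(x - e))[bwcheck]
  # keep h when it already covers enough observations, otherwise enlarge it
  max(h, h_min)
}

x       <- seq(0, 1, length.out = 101)  # evenly spaced design, for illustration only
h_small <- 0.02                         # deliberately too small a bandwidth
h_new   <- enlarge_bandwidth(x, e = 0.5, h = h_small, bwcheck = 21)
n_eff   <- sum(abs(x - 0.5) <= h_new)   # observations inside the enlarged window
```

A bandwidth that already covers bwcheck observations is returned unchanged, so the rule only acts as a lower bound.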

Value

Estimate

A matrix containing grid (grid points), h and b (bandwidths), and N (sample size).

opt

A list containing options passed to the function.
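The returned object can be inspected directly. The sketch below assumes only that the result is a list with an opt component, as stated above; it lists the component names of the installed version rather than assuming them. The simulated data and the seed are illustrative.

```r
library(nprobust)

set.seed(1)                          # illustrative seed for reproducibility
x   <- runif(500)
y   <- sin(4*x) + rnorm(500)
est <- lpbwselect(y, x, neval = 5)   # five evaluation points keep the output short

names(est)   # components returned by the installed version
est$opt      # options recorded for this call
```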

Author(s)

Sebastian Calonico, Columbia University, New York, NY. sebastian.calonico@columbia.edu.

Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.

Max H. Farrell, University of Chicago, Chicago, IL. max.farrell@chicagobooth.edu.

References

Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi: 10.1080/01621459.2017.1285776.

Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. doi: 10.18637/jss.v091.i08.

Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2020. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Working Paper.

Fan, J., and Gijbels, I. 1996. Local Polynomial Modelling and Its Applications, London: Chapman and Hall.

Wand, M., and Jones, M. 1995. Kernel Smoothing, Florida: Chapman & Hall/CRC.

See Also

lprobust

Examples

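library(nprobust)
set.seed(42)                   # make the simulated data reproducible
x   <- runif(500)
y   <- sin(4*x) + rnorm(500)   # nonlinear signal plus standard normal noise
est <- lpbwselect(y, x)        # default selector: bwselect = "mse-dpi"
summary(est)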
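Beyond the default call, the documented arguments can be combined. The following sketch uses only arguments listed under Usage; the simulated data, the seed and the chosen evaluation points are illustrative assumptions.

```r
library(nprobust)

set.seed(123)                 # illustrative seed
x <- runif(500)
y <- sin(4*x) + rnorm(500)

# MSE-optimal DPI bandwidths at three user-chosen evaluation points
est_pts <- lpbwselect(y, x, eval = c(0.25, 0.5, 0.75), bwselect = "mse-dpi")
summary(est_pts)

# IMSE-optimal DPI bandwidth with a triangular kernel and a local quadratic fit
est_imse <- lpbwselect(y, x, p = 2, kernel = "tri", bwselect = "imse-dpi")
summary(est_imse)
```

The first call yields a bandwidth for each evaluation point, while the IMSE selector targets the grid of evaluation points as a whole.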

[Package nprobust version 0.4.0 Index]