binsregselect {binsreg} | R Documentation |
Data-Driven IMSE-Optimal Partitioning/Binning Selection for Binscatter
Description
binsregselect
implements data-driven procedures for selecting the number of bins for binscatter
estimation. The selected number is optimal in minimizing integrated mean squared error (IMSE).
Usage
binsregselect(y, x, w = NULL, data = NULL, deriv = 0, bins = NULL,
pselect = NULL, sselect = NULL, binspos = "qs", nbins = NULL,
binsmethod = "dpi", nbinsrot = NULL, simsgrid = 20, savegrid = F,
vce = "HC1", useeffn = NULL, randcut = NULL, cluster = NULL,
dfcheck = c(20, 30), masspoints = "on", weights = NULL,
subset = NULL, norotnorm = F, numdist = NULL, numclust = NULL)
Arguments
y |
outcome variable. A vector. |
x |
independent variable of interest. A vector. |
w |
control variables. A matrix, a vector or a |
data |
an optional data frame containing variables used in the model. |
deriv |
derivative order of the regression function for estimation, testing and plotting.
The default is deriv = 0. |
bins |
a vector. |
pselect |
vector of numbers within which the degree of polynomial |
sselect |
vector of numbers within which the number of smoothness constraints |
binspos |
position of binning knots. The default is binspos = "qs". |
nbins |
number of bins for degree/smoothness selection. If |
binsmethod |
method for data-driven selection of the number of bins. The default is binsmethod = "dpi". |
nbinsrot |
initial number of bins used to construct the DPI number-of-bins selector. If not specified, the data-driven ROT selector is used instead. |
simsgrid |
number of evaluation points of an evenly-spaced grid within each bin used for evaluation of
the supremum (infimum or Lp metric) operation needed to construct confidence bands and hypothesis testing
procedures. The default is simsgrid = 20. |
savegrid |
if true, a data frame containing the grid is produced. |
vce |
procedure to compute the variance-covariance matrix estimator. Options are
|
useeffn |
effective sample size to be used when computing the (IMSE-optimal) number of bins. This option is useful for extrapolating the optimal number of bins to larger (or smaller) datasets than the one used to compute it. |
randcut |
upper bound on a uniformly distributed variable used to draw a subsample for bins/degree/smoothness selection.
Observations for which |
cluster |
cluster ID. Used to compute cluster-robust standard errors. |
dfcheck |
adjustments for minimum effective sample size checks, which take into account number of unique
values of |
masspoints |
how mass points in
|
weights |
an optional vector of weights to be used in the fitting process. Should be |
subset |
optional rule specifying a subset of observations to be used. |
norotnorm |
if true, a uniform density rather than a normal density is used for ROT selection. |
numdist |
number of distinct values for selection. Used to speed up computation. |
numclust |
number of clusters for selection. Used to speed up computation. |
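The randcut entry above describes an upper bound on a uniformly distributed variable used to draw a subsample. A minimal sketch of that idea in base R follows. It assumes (this is not confirmed by the text above) that one Uniform(0, 1) draw is made per observation and that an observation is kept when its draw does not exceed the bound. The value 0.3 and the names bound, keep and frac_kept are illustrative only. The call to binsregselect is guarded so the sketch also runs where the package is not installed.

```r
set.seed(123)
n <- 5000
x <- runif(n)
y <- sin(x) + rnorm(n)

# Assumed mechanism: one Uniform(0, 1) draw per observation; keep draws <= bound
bound <- 0.3                       # illustrative value, not a recommended setting
keep <- runif(n) <= bound
frac_kept <- mean(keep)            # close to 'bound' in large samples

# Pass the same bound to the selector (guarded: requires the binsreg package)
if (requireNamespace("binsreg", quietly = TRUE)) {
  sel_sub <- binsreg::binsregselect(y, x, randcut = bound)
  summary(sel_sub)
}
```

Subsampling of this kind trades some precision in the selected number of bins for speed, which mainly matters for large datasets.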
Value
nbinsrot.poly |
ROT number of bins, unregularized. |
nbinsrot.regul |
ROT number of bins, regularized. |
nbinsrot.uknot |
ROT number of bins, unique knots. |
nbinsdpi |
DPI number of bins. |
nbinsdpi.uknot |
DPI number of bins, unique knots. |
prot.poly |
ROT degree of polynomials, unregularized. |
prot.regul |
ROT degree of polynomials, regularized. |
prot.uknot |
ROT degree of polynomials, unique knots. |
pdpi |
DPI degree of polynomials. |
pdpi.uknot |
DPI degree of polynomials, unique knots. |
srot.poly |
ROT number of smoothness constraints, unregularized. |
srot.regul |
ROT number of smoothness constraints, regularized. |
srot.uknot |
ROT number of smoothness constraints, unique knots. |
sdpi |
DPI number of smoothness constraints. |
sdpi.uknot |
DPI number of smoothness constraints, unique knots. |
imse.var.rot |
Variance constant in IMSE expansion, ROT selection. |
imse.bsq.rot |
Bias constant in IMSE expansion, ROT selection. |
imse.var.dpi |
Variance constant in IMSE expansion, DPI selection. |
imse.bsq.dpi |
Bias constant in IMSE expansion, DPI selection. |
int.result |
Intermediate results, including a matrix of degree and smoothness ( |
opt |
A list containing options passed to the function, as well as total sample size |
data.grid |
A data frame containing the grid. |
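To see how variance and bias constants such as those listed above can translate into a number of bins, the sketch below assumes an IMSE expansion of the form V * J^(1 + 2v) / n + B * J^(-2(p - v + 1)), with J the number of bins, p the polynomial degree and v the derivative order. Setting the derivative with respect to J to zero gives J = (2(p - v + 1) B n / ((1 + 2v) V))^(1 / (2p + 3)). The helper imse_nbins is hypothetical and not part of binsreg, and the constants returned by binsregselect may be normalized differently, so check the references before plugging them in.

```r
# Hypothetical helper (not part of binsreg): minimizer of the assumed expansion
#   IMSE(J) ~ var_const * J^(1 + 2 * deriv) / n + bsq_const * J^(-2 * (p - deriv + 1))
imse_nbins <- function(var_const, bsq_const, n, p = 0, deriv = 0) {
  stopifnot(var_const > 0, bsq_const > 0, n > 0, p >= deriv)
  ratio <- 2 * (p - deriv + 1) * bsq_const / ((1 + 2 * deriv) * var_const)
  (ratio * n)^(1 / (2 * p + 3))    # real-valued; round up to get a bin count
}

# Made-up constants, chosen only to make the arithmetic easy to follow
j_small <- imse_nbins(var_const = 1, bsq_const = 4, n = 1000)
j_large <- imse_nbins(var_const = 1, bsq_const = 4, n = 8000)
```

Under this assumed form the optimal number of bins grows like n^(1 / (2p + 3)), which is the kind of rescaling that the useeffn argument is described as enabling when extrapolating to a larger or smaller dataset.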
Author(s)
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Richard K. Crump, Federal Reserve Bank of New York, New York, NY. richard.crump@ny.frb.org.
Max H. Farrell, UC Santa Barbara, Santa Barbara, CA. mhfarrell@gmail.com.
Yingjie Feng (maintainer), Tsinghua University, Beijing, China. fengyingjiepku@gmail.com.
References
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024a: On Binscatter. American Economic Review 114(5): 1488-1514.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024b: Nonlinear Binscatter Methods. Working Paper.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024c: Binscatter Regressions. Working Paper.
See Also
Examples
library(binsreg)
# Simulated data: noisy sine curve on [0, 1]
x <- runif(500)
y <- sin(x) + rnorm(500)
est <- binsregselect(y, x)
summary(est)
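A slightly longer sketch, using only arguments listed under Usage, adds a control variable, a cluster identifier and an initial number of bins for the DPI selector. The simulated variables w and id and the value nbinsrot = 10 are illustrative assumptions, not recommended settings. The calls are guarded so the block also runs where binsreg is not installed.

```r
set.seed(2024)
n  <- 1000
id <- rep(seq_len(100), each = 10)   # hypothetical cluster identifier
w  <- rnorm(n)                       # hypothetical control variable
x  <- runif(n)
y  <- sin(x) + 0.5 * w + rnorm(n)

if (requireNamespace("binsreg", quietly = TRUE)) {
  # Selection with a control variable
  est_w <- binsreg::binsregselect(y, x, w = w)

  # Same, with a cluster identifier and a user-supplied initial number of bins
  est_cl <- binsreg::binsregselect(y, x, w = w, cluster = id, nbinsrot = 10)
  summary(est_cl)

  # Selected DPI number of bins (component listed under Value)
  est_cl$nbinsdpi
}
```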