lpcde {lpcde} | R Documentation |
Local polynomial conditional density estimation
Description
lpcde implements the local polynomial regression based conditional density (and derivatives)
estimator proposed in Cattaneo et al. (2024).
Robust bias-corrected inference methods, both pointwise (confidence intervals) and
uniform (confidence bands), are also implemented.
Usage
lpcde(
  x_data,
  y_data,
  y_grid = NULL,
  x = NULL,
  bw = NULL,
  p = NULL,
  q = NULL,
  p_RBC = NULL,
  q_RBC = NULL,
  mu = NULL,
  nu = NULL,
  rbc = TRUE,
  ng = NULL,
  normalize = FALSE,
  nonneg = FALSE,
  grid_spacing = "",
  kernel_type = c("epanechnikov", "triangular", "uniform"),
  bw_type = NULL
)
Arguments
x_data |
Numeric matrix/data frame, the raw data of covariates. |
y_data |
Numeric matrix/data frame, the raw data of the dependent (response) variable. |
y_grid |
Numeric, specifies the grid of evaluation points in the y-direction. When set to default, grid points are chosen as the 0.05 to 0.95 quantiles of the data, in steps of 0.05. |
x |
Numeric, specifies the grid of evaluation points in the x-direction. When set to default, the evaluation point will be chosen as the median of the x data. |
bw |
Numeric, specifies the bandwidth used for estimation. Can be (1) a positive
scalar (common bandwidth for all grid points); or (2) a positive numeric vector/matrix
specifying bandwidths for each grid point (should be the same dimension as |
p |
Nonnegative integer, specifies the order of the local polynomial for |
q |
Nonnegative integer, specifies the order of the local polynomial for |
p_RBC |
Nonnegative integer, specifies the order of the local polynomial for |
q_RBC |
Nonnegative integer, specifies the order of the local polynomial for |
mu |
Nonnegative integer, specifies the derivative with respect to |
nu |
Nonnegative integer, specifies the derivative with respect to |
rbc |
Boolean. TRUE (default) performs robust bias-corrected (RBC) calculations, which are required for valid uniform inference. |
ng |
Integer, number of grid points to be used. Generates evenly spaced points over the support of the data. |
normalize |
Boolean. FALSE (default) returns the original estimator; TRUE normalizes the estimates to integrate to 1. |
nonneg |
Boolean. FALSE (default) returns the original estimator; TRUE returns the maximum of the estimate and 0. |
grid_spacing |
String. If equal to "quantile", generates quantile-spaced grid evaluation points; otherwise generates equally spaced points. |
kernel_type |
String, specifies the kernel function, should be one of
|
bw_type |
String, specifies the method for data-driven bandwidth selection. This option will be
ignored if |
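The defaults and post-processing options described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the simulated data, the helper trapezoid(), the variable names, the quantile-spaced construction, and the use of the trapezoidal rule for normalization are all hypothetical.

```r
# Illustration only: nothing below calls or reproduces lpcde internals.
set.seed(1)
y <- rnorm(200)  # hypothetical response data
x <- rnorm(200)  # hypothetical covariate data

# Default y_grid as documented: the 0.05 to 0.95 quantiles in steps of 0.05.
y_grid_default <- quantile(y, probs = seq(0.05, 0.95, by = 0.05), names = FALSE)

# Default evaluation point in the x-direction: the median of the x data.
x_default <- median(x)

# ng evenly spaced points over the support of the data.
ng <- 10
y_grid_even <- seq(min(y), max(y), length.out = ng)

# One possible reading of grid_spacing = "quantile" (assumption).
y_grid_quantile <- quantile(y, probs = seq(0, 1, length.out = ng), names = FALSE)

# Hypothetical raw density estimates on a small grid.
grid <- seq(-1, 1, length.out = 5)
est <- c(-0.02, 0.10, 0.35, 0.30, 0.05)

# nonneg = TRUE: maximum of the estimate and 0.
est_nonneg <- pmax(est, 0)

# normalize = TRUE: rescale so the estimates integrate to 1.
# The trapezoidal rule is an assumption about how the integral is approximated.
trapezoid <- function(grid, f) sum(diff(grid) * (f[-1] + f[-length(f)]) / 2)
est_normalized <- est / trapezoid(grid, est)
```

Passing a grid built this way as y_grid keeps the evaluation points explicit rather than relying on the defaults.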
Details
Bias correction is only used for the construction of confidence intervals/bands, but not for point estimation.
The point estimates, denoted by est, are constructed using local polynomial estimates of order p and q,
while the centering of the confidence intervals/bands, denoted by est_RBC, is constructed using local
polynomial estimates of order p_RBC and q_RBC. The confidence intervals/bands take the form
[est_RBC - cv * SE(est_RBC), est_RBC + cv * SE(est_RBC)], where cv denotes the appropriate critical value
and SE(est_RBC) denotes a standard error estimate for the centering of the confidence interval/band.
As a result, the confidence intervals/bands may not be centered at the point estimates because they
have been bias-corrected.
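The interval form can be made concrete with a small base R calculation. All numbers are hypothetical, and using a standard normal quantile as cv is an assumption that would apply only to a pointwise 95% interval; a uniform band would need a different critical value.

```r
# Hypothetical values at a single evaluation point (not package output).
est     <- 0.40   # point estimate from orders p and q
est_RBC <- 0.36   # bias-corrected centering from orders p_RBC and q_RBC
se_RBC  <- 0.015  # standard error estimate of est_RBC

# Assumption: normal critical value for a pointwise 95% interval.
cv <- qnorm(0.975)

# [est_RBC - cv * SE(est_RBC), est_RBC + cv * SE(est_RBC)]
ci <- c(lower = est_RBC - cv * se_RBC, upper = est_RBC + cv * se_RBC)

# The interval is centered at est_RBC, so it need not contain est.
inside <- est >= ci[["lower"]] && est <= ci[["upper"]]
print(ci)
print(inside)
```

With these particular numbers the point estimate falls outside the interval, because the interval is centered at est_RBC rather than est.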
Setting p_RBC equal to p and q_RBC equal to q results in confidence intervals/bands centered at the
point estimates, but requires undersmoothing for valid inference (i.e., the (I)MSE-optimal bandwidth
for the density point estimator cannot be used). Hence the bandwidth would need to be specified manually
when p_RBC = p and q_RBC = q, and the point estimates will not be (I)MSE-optimal. See Cattaneo, Jansson and Ma
(2020a, 2020b) for details, and also Calonico, Cattaneo, and Farrell (2018, 2022)
for robust bias correction methods.
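A call that centers the intervals at the point estimates might look like the following sketch. The polynomial orders (p = 2, q = 1) and the bandwidth 0.3 are hypothetical choices made for illustration; the bandwidth is set by hand because a data-driven (I)MSE-optimal choice would not give valid inference in this configuration. The call is guarded so the snippet still runs where the package is not installed.

```r
set.seed(42)
n <- 500
x_data <- matrix(rnorm(n, mean = 0, sd = 1))
y_data <- matrix(rnorm(n, mean = x_data, sd = 1))
y_grid <- seq(from = -1, to = 1, length.out = 5)

if (requireNamespace("lpcde", quietly = TRUE)) {
  fit_centered <- lpcde::lpcde(
    x_data = x_data, y_data = y_data, y_grid = y_grid, x = 0,
    bw = 0.3,              # hypothetical manual (undersmoothed) bandwidth
    p = 2, q = 1,          # hypothetical orders for the point estimate
    p_RBC = 2, q_RBC = 1   # same orders, so intervals center at the point estimate
  )
  summary(fit_centered)
}
```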
Sometimes the density point estimates may lie outside of the confidence intervals/bands, which can
happen if the underlying distribution exhibits high curvature at some evaluation point(s). One possible
solution in this case is to increase the polynomial order p or to employ a smaller bandwidth.
Value
Estimate |
A matrix containing (1) |
CovMat |
The variance-covariance matrix corresponding to |
opt |
A list containing options passed to the function. |
Author(s)
Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu.
Rajita Chandak (maintainer), Princeton University. rchandak@princeton.edu.
Michael Jansson, University of California Berkeley. mjansson@econ.berkeley.edu.
Xinwei Ma, University of California San Diego. x1ma@ucsd.edu.
References
Cattaneo MD, Chandak R, Jansson M, Ma X (2024).
“Local Polynomial Conditional Density Estimators.”
Bernoulli.
Calonico S, Cattaneo MD, Farrell MH (2018).
“On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference.”
Journal of the American Statistical Association, 113(522), 767–779.
Calonico S, Cattaneo MD, Farrell MH (2022).
“Coverage Error Optimal Confidence Intervals for Local Polynomial Regression.”
Bernoulli, 28(4), 2998–3022.
Cattaneo MD, Jansson M, Ma X (2020).
“Simple Local Polynomial Density Estimators.”
Journal of the American Statistical Association, 115(531), 1449–1455.
See Also
Supported methods: coef.lpcde, confint.lpcde, plot.lpcde, print.lpcde, summary.lpcde, vcov.lpcde
Examples
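# Density estimation example
set.seed(42)  # fix the seed so the simulated data are reproducible
n = 500
x_data = matrix(rnorm(n, mean=0, sd=1))
y_data = matrix(rnorm(n, mean=x_data, sd=1))
y_grid = seq(from=-1, to=1, length.out=5)
# estimate the conditional density of y given x = 0 with a fixed bandwidth
model1 = lpcde::lpcde(x_data=x_data, y_data=y_data, y_grid=y_grid, x=0, bw=0.5)
# summary of estimation
summary(model1)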
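As a follow-up, the components listed under Value and the methods listed under See Also could be inspected as sketched below. The snippet assumes that coef() and vcov() accept the fitted object with default arguments; it rebuilds the data so that it is self-contained, and it is guarded so it still runs where the package is not installed.

```r
set.seed(42)
n <- 500
x_data <- matrix(rnorm(n, mean = 0, sd = 1))
y_data <- matrix(rnorm(n, mean = x_data, sd = 1))
y_grid <- seq(from = -1, to = 1, length.out = 5)

if (requireNamespace("lpcde", quietly = TRUE)) {
  fit <- lpcde::lpcde(x_data = x_data, y_data = y_data,
                      y_grid = y_grid, x = 0, bw = 0.5)

  # Components documented under Value.
  print(fit$Estimate)   # matrix of estimates
  print(dim(fit$CovMat))
  str(fit$opt)          # options passed to the function

  # S3 methods listed under See Also (default arguments assumed).
  print(coef(fit))
  print(vcov(fit))
}
```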