lpcde {lpcde} R Documentation

Local polynomial conditional density estimation

Description

lpcde implements the local polynomial regression based conditional density (and derivatives) estimator proposed in Cattaneo et al. (2024). Robust bias-corrected inference methods, both pointwise (confidence intervals) and uniform (confidence bands), are also implemented.

Usage

lpcde(
  x_data,
  y_data,
  y_grid = NULL,
  x = NULL,
  bw = NULL,
  p = NULL,
  q = NULL,
  p_RBC = NULL,
  q_RBC = NULL,
  mu = NULL,
  nu = NULL,
  rbc = TRUE,
  ng = NULL,
  normalize = FALSE,
  nonneg = FALSE,
  grid_spacing = "",
  kernel_type = c("epanechnikov", "triangular", "uniform"),
  bw_type = NULL
)

Arguments

x_data

Numeric matrix/data frame, the raw data of covariates.

y_data

Numeric matrix/data frame, the raw data of the dependent variable.

y_grid

Numeric, specifies the grid of evaluation points in the y-direction. When set to default, grid points will be chosen as the 0.05 to 0.95 quantiles of the data, with a step size of 0.05 in the y-direction.

x

Numeric, specifies the grid of evaluation points in the x-direction. When set to default, the evaluation point will be chosen as the median of the x data.

bw

Numeric, specifies the bandwidth used for estimation. Can be (1) a positive scalar (common bandwidth for all grid points); or (2) a positive numeric vector/matrix specifying bandwidths for each grid point (should be the same dimension as grid).

p

Nonnegative integer, specifies the order of the local polynomial for Y used to construct point estimates. (Default is 2.)

q

Nonnegative integer, specifies the order of the local polynomial for X used to construct point estimates. (Default is 1.)

p_RBC

Nonnegative integer, specifies the order of the local polynomial for Y used to construct bias-corrected point estimates. (Default is p+1.)

q_RBC

Nonnegative integer, specifies the order of the local polynomial for X used to construct bias-corrected point estimates. (Default is q+1.)

mu

Nonnegative integer, specifies the derivative with respect to Y of the distribution function to be estimated. 0 for the distribution function, 1 (default) for the density function, etc.

nu

Nonnegative integer, specifies the derivative with respect to X of the distribution function to be estimated. Default value is 0.

rbc

Boolean. TRUE (default) for robust bias-corrected (RBC) calculations, required for valid uniform inference.

ng

Integer, number of grid points to be used. Generates evenly spaced points over the support of the data.

normalize

Boolean. FALSE (default) returns the original estimator; TRUE normalizes the estimates to integrate to 1.

nonneg

Boolean. FALSE (default) returns the original estimator; TRUE returns the maximum of the estimate and 0.

grid_spacing

String. If equal to "quantile", quantile-spaced grid evaluation points are generated; otherwise equally spaced points are generated.

kernel_type

String, specifies the kernel function; should be one of "triangular", "uniform", and "epanechnikov" (default).

bw_type

String, specifies the method for data-driven bandwidth selection. This option will be ignored if bw is provided. Implemented with "mse-dpi" (default; mean squared error-optimal bandwidth selected for each grid point).
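The grid and kernel arguments above can be made concrete with a small base R sketch. It does not call lpcde or reproduce its internals: make_grid and kernel_weight are hypothetical helpers written for illustration, the quantile-spaced grid running from the 0 to the 1 quantile is an assumption, and the kernel formulas are the standard textbook definitions on [-1, 1].

```r
# Illustrative only: base R, no lpcde internals are used.
set.seed(1)
y <- rnorm(200)

# Default y_grid as described above: the 0.05, 0.10, ..., 0.95 quantiles of y.
y_grid_default <- quantile(y, probs = seq(0.05, 0.95, by = 0.05), names = FALSE)

# Hypothetical helper: ng grid points, quantile-spaced or equally spaced.
make_grid <- function(y, ng, grid_spacing = "") {
  if (grid_spacing == "quantile") {
    quantile(y, probs = seq(0, 1, length.out = ng), names = FALSE)
  } else {
    seq(min(y), max(y), length.out = ng)
  }
}

# Hypothetical helper: standard kernels supported on [-1, 1].
kernel_weight <- function(u, kernel_type = c("epanechnikov", "triangular", "uniform")) {
  kernel_type <- match.arg(kernel_type)
  inside <- abs(u) <= 1
  switch(kernel_type,
    epanechnikov = 0.75 * (1 - u^2) * inside,
    triangular   = (1 - abs(u)) * inside,
    uniform      = 0.5 * inside
  )
}

length(y_grid_default)            # 19 grid points
kernel_weight(0, "epanechnikov")  # 0.75
```

Passing a grid built this way as y_grid gives explicit control over the evaluation points instead of relying on the defaults.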

Details

Bias correction is used only for the construction of confidence intervals/bands, not for point estimation. The point estimates, denoted by est, are constructed using local polynomial estimates of order p and q, while the centering of the confidence intervals/bands, denoted by est_RBC, is constructed using local polynomial estimates of order p_RBC and q_RBC. The confidence intervals/bands take the form: [est_RBC - cv * SE(est_RBC) , est_RBC + cv * SE(est_RBC)], where cv denotes the appropriate critical value and SE(est_RBC) denotes a standard error estimate for the centering of the confidence interval/band. As a result, the confidence intervals/bands may not be centered at the point estimates because they have been bias-corrected. Setting p_RBC equal to p and q_RBC equal to q results in confidence intervals/bands centered at the point estimates, but requires undersmoothing for valid inference (i.e., the (I)MSE-optimal bandwidth for the density point estimator cannot be used). Hence the bandwidth would need to be specified manually when p_RBC=p and q_RBC=q, and the point estimates will not be (I)MSE optimal. See Cattaneo, Jansson and Ma (2020) for details, and also Calonico, Cattaneo, and Farrell (2018, 2022) for robust bias correction methods.

Sometimes the density point estimates may lie outside of the confidence intervals/bands, which can happen if the underlying distribution exhibits high curvature at some evaluation point(s). One possible solution in this case is to increase the polynomial order p or to employ a smaller bandwidth.
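The interval form described in this section can be sketched in a few lines of base R. The numbers below are made up and only stand in for the est, est_RBC and se_RBC quantities. The critical value is the pointwise standard normal one; a uniform band would need a different critical value, which is not shown here.

```r
# Hypothetical numbers standing in for estimates at three grid points.
est     <- c(0.20, 0.35, 0.40)    # point estimates (orders p, q)
est_RBC <- c(0.22, 0.31, 0.41)    # bias-corrected estimates (orders p_RBC, q_RBC)
se_RBC  <- c(0.02, 0.015, 0.03)   # standard errors of est_RBC

# Pointwise 95% critical value from the standard normal distribution.
cv <- qnorm(0.975)

# Interval centered at est_RBC, not at est.
ci_lower <- est_RBC - cv * se_RBC
ci_upper <- est_RBC + cv * se_RBC

# Flag grid points where the point estimate falls outside the interval.
outside <- est < ci_lower | est > ci_upper
data.frame(est, est_RBC, ci_lower, ci_upper, outside)
```

In this toy example the second point estimate lies outside its interval, which is the situation the paragraph above describes.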

Value

Estimate

A matrix containing
(1) grid (grid points),
(2) bw (bandwidths),
(3) est (point estimates with p-th and q-th order local polynomial),
(4) est_RBC (point estimates with p_RBC-th and q_RBC-th order local polynomial),
(5) se (standard error corresponding to est),
(6) se_RBC (standard error corresponding to est_RBC).

CovMat

The variance-covariance matrix corresponding to est.

opt

A list containing options passed to the function.
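To illustrate how estimates laid out like the Estimate matrix might be post-processed, the sketch below applies the operations that the nonneg and normalize arguments describe to a hand-written matrix. The values are made up, only the grid and est columns are mimicked, and the trapezoid-rule rescaling is an assumption about one reasonable way to normalize, not a statement of how the package does it.

```r
# Hand-written stand-in for part of the Estimate matrix (values are made up).
Estimate <- cbind(
  grid = c(-1, -0.5, 0, 0.5, 1),
  est  = c(-0.02, 0.20, 0.45, 0.25, 0.05)
)

# nonneg = TRUE as described: the maximum of the estimate and 0.
est_nonneg <- pmax(Estimate[, "est"], 0)

# Assumed normalization: rescale so the trapezoid-rule integral over the grid is 1.
trapz <- function(x, f) sum(diff(x) * (head(f, -1) + tail(f, -1)) / 2)
est_norm <- est_nonneg / trapz(Estimate[, "grid"], est_nonneg)

trapz(Estimate[, "grid"], est_norm)  # 1, up to floating point error
```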

Author(s)

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu.

Rajita Chandak (maintainer), Princeton University. rchandak@princeton.edu.

Michael Jansson, University of California Berkeley. mjansson@econ.berkeley.edu.

Xinwei Ma, University of California San Diego. x1ma@ucsd.edu.

References

Cattaneo MD, Chandak R, Jansson M, Ma X (2024). “Local Polynomial Conditional Density Estimators.” Bernoulli.
Calonico S, Cattaneo MD, Farrell MH (2018). “On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference.” Journal of the American Statistical Association, 113(522), 767–779.
Calonico S, Cattaneo MD, Farrell MH (2022). “Coverage Error Optimal Confidence Intervals for Local Polynomial Regression.” Bernoulli, 28(4), 2998–3022.
Cattaneo MD, Jansson M, Ma X (2020). “Simple Local Polynomial Density Estimators.” Journal of the American Statistical Association, 115(531), 1449–1455.

See Also

Supported methods: coef.lpcde, confint.lpcde, plot.lpcde, print.lpcde, summary.lpcde, vcov.lpcde

Examples

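# Density estimation example
set.seed(42)  # for reproducibility
n <- 500
x_data <- matrix(rnorm(n, mean = 0, sd = 1))
# y is drawn conditionally on x: Y | X = x ~ N(x, 1)
y_data <- matrix(rnorm(n, mean = x_data, sd = 1))
y_grid <- seq(from = -1, to = 1, length.out = 5)
# Estimate the conditional density of y at x = 0 with a common bandwidth of 0.5
model1 <- lpcde::lpcde(x_data = x_data, y_data = y_data, y_grid = y_grid, x = 0, bw = 0.5)
# Summary of estimation
summary(model1)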


[Package lpcde version 0.1.4 Index]