lokerns {lokern} | R Documentation |
Kernel Regression Smoothing with Local Plug-in Bandwidth
Description
Nonparametric estimation of regression functions and their derivatives with kernel regression estimators and automatically adapted local plug-in bandwidth function.
Usage
lokerns(x, ...)
## Default S3 method:
lokerns(x, y=NULL, deriv = 0, n.out=300, x.out=NULL, x.inOut = TRUE,
korder = deriv + 2, hetero=FALSE, is.rand=TRUE,
inputb = is.numeric(bandwidth) && all(bandwidth > 0),
m1 = 400, xl=NULL, xu=NULL,
s=NULL, sig=NULL, bandwidth=NULL, trace.lev = 0, ...)
## S3 method for class 'formula'
lokerns(formula, data, subset, na.action, ...)
Arguments
x |
vector of design points, not necessarily ordered. |
y |
vector of observations of the same length as x. |
deriv |
order of derivative of the regression function to be
estimated. Only |
n.out |
number of output design points where the function has to
be estimated; default is n.out=300. |
x.out |
vector of output design points where the function has to be estimated. The default is an equidistant grid of n.out points from min(x) to max(x). |
x.inOut |
logical or character string indicating if In order for |
korder |
nonnegative integer giving the kernel order |
hetero |
logical: if TRUE, heteroscedastic error variables are assumed for variance estimation, if FALSE the variance estimation is optimized for homoscedasticity. Default value is hetero=FALSE. |
is.rand |
logical: if |
inputb |
logical: if true, a local input bandwidth array is used; if
|
m1 |
integer, the number of grid points for integral approximation when estimating the plug-in bandwidth. The default, 400, may be increased if a very large number of observations are available. |
xl , xu |
numeric (scalars), the lower and upper bounds for integral
approximation and variance estimation when estimating the plug-in
bandwidth. By default (when |
s |
s-array of the convolution kernel estimator. If it is not given by input
it is calculated as midpoint-sequence of the ordered design points for
|
sig |
variance of the error variables. If it is not given by
input or if |
bandwidth |
local bandwidth array for kernel regression estimation. If it is
not given by input or if |
trace.lev |
integer indicating how much the internal (Fortran
level) computations should be “traced”, i.e., be reported.
The default, |
formula |
a |
data |
an optional matrix or data frame (or similar: see
|
subset |
an optional vector specifying a subset of observations to be used. |
na.action |
a function which indicates what should happen when
the data contain |
... |
for the |
Details
This function calls an efficient and fast algorithm for automatically adaptive nonparametric regression estimation with a kernel method.
Roughly speaking, the method performs a local averaging of the observations when estimating the regression function. Analogously, one can estimate low-order derivatives of the regression function. Crucial for the kernel regression estimation used here is the choice of the local bandwidth array. Bandwidths that are too small lead to a wiggly curve; bandwidths that are too large smooth away important details. The function lokerns calculates an estimator of the regression function or of its derivatives with an automatically chosen local plug-in bandwidth function. It is also possible to use a local bandwidth array specified by the user.
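The effect of the bandwidth on the fitted curve can be illustrated with a minimal base R sketch. It does not use lokerns or its convolution kernel estimator; it swaps in a plain Nadaraya-Watson smoother with an Epanechnikov kernel and one bandwidth per output point. The helper name nw_local and the simulated data are assumptions made only for this illustration.

```r
## Hypothetical helper (not part of lokern): Nadaraya-Watson smoother with
## one bandwidth h[i] per output point x.out[i] and an Epanechnikov kernel.
nw_local <- function(x, y, x.out, h) {
  stopifnot(length(h) == length(x.out), all(h > 0))
  vapply(seq_along(x.out), function(i) {
    u <- (x - x.out[i]) / h[i]
    w <- ifelse(abs(u) < 1, 0.75 * (1 - u^2), 0)  # kernel weights
    sum(w * y) / sum(w)                           # local weighted mean
  }, numeric(1))
}

## Simulated data (an assumption for this sketch)
set.seed(1)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)
x.out <- seq(0, 1, length.out = 101)

est.small <- nw_local(x, y, x.out, rep(0.02, length(x.out)))  # tiny bandwidths
est.large <- nw_local(x, y, x.out, rep(0.50, length(x.out)))  # huge bandwidths

## Total variation of each curve as a crude roughness measure
c(small = sum(abs(diff(est.small))), large = sum(abs(diff(est.large))))
```

The bandwidth vector h may vary along x.out, which is the sense in which a bandwidth array is "local"; constant vectors are used here only to make the two extremes easy to compare.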
The main idea of the plug-in method is to estimate the optimal bandwidths
by estimating the asymptotically mean squared error optimal bandwidths.
To do so, one has to estimate the variance for homoscedastic error
variables, or a functional of a smooth variance function for
heteroscedastic error variables. One also has to estimate an integral
functional of the squared k-th derivative of the regression function
(k = korder) for the global bandwidth, and the squared k-th derivative
itself for the local bandwidths.
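To make the role of these quantities concrete, here is a sketch of the textbook asymptotic calculation for a global bandwidth in the simplest case. It is not the algorithm used by lokerns, which has to estimate the variance and the derivative functional from the data. The assumptions are: deriv = 0, k = 2, an Epanechnikov kernel, a design with uniform density on [0, 1], and a known error variance and regression function. All object names are chosen for this illustration only.

```r
## Assumptions (not from lokern): nu = 0, k = 2, Epanechnikov kernel,
## uniform design on [0, 1], known sigma^2, known m(x) = sin(2*pi*x).
n      <- 200
sigma2 <- 0.25
RK     <- 3/5   # integral of K^2 for the Epanechnikov kernel
mu2    <- 1/5   # second moment of the Epanechnikov kernel

## Integral functional of the squared second derivative,
## m''(x) = -4*pi^2*sin(2*pi*x)
I2 <- integrate(function(x) (4 * pi^2 * sin(2 * pi * x))^2, 0, 1)$value

## Asymptotic mean integrated squared error: variance term + squared bias term
amise <- function(h) sigma2 * RK / (n * h) + h^4 * mu2^2 * I2 / 4

## Closed-form minimizer versus a numerical minimization
h.formula <- (sigma2 * RK / (n * mu2^2 * I2))^(1/5)
h.numeric <- optimize(amise, interval = c(0.01, 1))$minimum
c(formula = h.formula, numeric = h.numeric)
```

A larger variance or a smaller curvature functional I2 pushes the optimal bandwidth up, which is why both have to be estimated before a plug-in bandwidth can be computed.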
Some more details are in glkerns.
Value
an object of class(es) c("lokerns", "KernS"), which is
a list including used parameters and estimator, containing among others
x |
vector of ordered design points. |
y |
vector of observations ordered with respect to x. |
bandwidth |
local bandwidth array which was used for kernel regression estimation. |
x.out |
vector of ordered output design points. |
est |
vector of estimated regression function or its derivative
(at |
sig |
variance estimation which was used for calculating the plug-in bandwidths if hetero=TRUE and either inputb=FALSE (default) or is.rand=TRUE (default). |
deriv |
derivative of the regression function which was estimated. |
korder |
order of the kernel function which was used. |
xl |
lower bound for integral approximation and variance estimation. |
xu |
upper bound for integral approximation and variance estimation. |
s |
vector of midpoint values used for the convolution kernel regression estimator. |
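As a short sketch of how the components listed above might be inspected, the following uses the default method on the cars data. It assumes the lokern package is installed, and it assumes that est and bandwidth are both given on the grid x.out; the whole block is guarded so that it does nothing if the package is missing.

```r
## Assumes the lokern package is available; skipped otherwise.
if (requireNamespace("lokern", quietly = TRUE)) {
  fit <- lokern::lokerns(cars$speed, cars$dist)

  ## A few of the components named in the Value list
  str(fit[c("x.out", "est", "bandwidth", "deriv", "korder")])

  ## Local bandwidths plotted against the output grid
  ## (assumption: one bandwidth per element of x.out)
  plot(fit$x.out, fit$bandwidth, type = "l",
       xlab = "x.out", ylab = "local bandwidth")
}
```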
References
All the references in glkerns.
See Also
glkerns for global bandwidth computation.
plot.KernS documents all the methods for "KernS" classed objects.
Examples
data(cars)
lofit <- lokerns(dist ~ speed, data=cars)
lofit # print() method
if(require("sfsmisc")) {
TA.plot(lofit)
} else { plot(residuals(lofit) ~ fitted(lofit)); abline(h = 0, lty=2) }
qqnorm(residuals(lofit), ylab = "residuals(lofit)")
## nice simple plot of data + smooth
plot(lofit)
(sb <- summary(lofit$bandwidth))
op <- par(fg = "gray90", tcl = -0.2, mgp = c(3,.5,0))
plot(lofit$bandwidth, ylim=c(0,3*sb["Max."]), type="h",#col="gray90",
ann = FALSE, axes = FALSE)
boxplot(lofit$bandwidth, add = TRUE, at = 304, boxwex = 8,
col = "gray90",border="gray", pars = list(axes = FALSE))
axis(4, at = c(0,pretty(sb)), col.axis = "gray")
par(op)
par(new=TRUE)
plot(dist ~ speed, data = cars,
main = "Local Plug-In Bandwidth Vector")
lines(lofit, col=4, lwd=2)
mtext(paste("bandwidth in [",
paste(format(sb[c(1,6)], digits = 3),collapse=","),
"]; Median b.w.=",formatC(sb["Median"])))
## using user-specified bandwidth array
myBW <- round(2*lofit$bandwidth, 2)
(lofB <- lokerns(dist ~ speed, data=cars, bandwidth = myBW)) # failed (for a while)
## can use deriv=3 (and 4) here:
lofB3 <- lokerns(dist ~ speed, data=cars, bandwidth = myBW, deriv=3)
plot(lofB)
lines(lofB3, col=3)
stopifnot(inherits(lofB3, "KernS"), identical(lofB3$korder, 5L))