cv.select {SCBmeanfd} | R Documentation |
Cross-Validation Bandwidth Selection for Local Polynomial Estimation
Description
Select the cross-validation bandwidth described in Rice and Silverman (1991) for the local polynomial estimation of a mean function based on functional data.
Usage
cv.select(x, y, degree = 1, interval = NULL, gridsize = length(x), ...)
Arguments
x | observation points. Missing values are not accepted.
y | matrix or data frame with functional observations (= curves) stored in rows. The number of columns of
degree | degree of the local polynomial fit.
interval | lower and upper bounds of the search interval (numeric vector of length 2).
gridsize | size of evaluation grid for the smoothed data.
... | additional arguments to pass to the optimization function
Details
The cross-validation score is obtained by leaving out each curve in turn and computing the prediction error of the local polynomial smoother based on all other curves. For a bandwidth value h, this score is
CV(h) = \frac{1}{np} \sum_{i=1}^n \sum_{j=1}^p \left( Y_{ij} - \hat{\mu}^{-(i)}(x_j;h) \right)^2,
where Y_{ij} is the measurement of the i-th curve at location x_j for i=1,\ldots,n and j=1,\ldots,p, and \hat{\mu}^{-(i)}(x_j;h) is the local polynomial estimator with bandwidth h based on all curves except the i-th.
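The score above can be sketched in base R as follows. This is a minimal illustration of the formula, not the package's implementation: the helper names locpoly_fit and cv_score_sketch are hypothetical, the Gaussian kernel is an assumption, and the sketch assumes that all curves are observed at the same points x, so that the leave-one-out estimator reduces to smoothing the average of the remaining curves.

```r
# Hypothetical helper: local polynomial fit of a single vector `ybar`
# observed at `x`, evaluated at the points `x0`.
# The Gaussian kernel is an assumption made for this sketch.
locpoly_fit <- function(x, ybar, h, degree = 1, x0 = x) {
  sapply(x0, function(t) {
    w <- dnorm((x - t) / h)              # kernel weights centred at t
    X <- outer(x - t, 0:degree, `^`)     # local polynomial design matrix
    beta <- solve(crossprod(X, w * X), crossprod(X, w * ybar))
    beta[1]                              # intercept = fitted value at t
  })
}

# Hypothetical helper: leave-one-curve-out cross-validation score CV(h).
# `y` holds one curve per row, as in the Arguments section.
cv_score_sketch <- function(h, x, y, degree = 1) {
  y <- as.matrix(y)
  n <- nrow(y)
  total <- colSums(y)
  sq_err <- 0
  for (i in seq_len(n)) {
    mean_minus_i <- (total - y[i, ]) / (n - 1)   # mean of all curves but the i-th
    mu_hat <- locpoly_fit(x, mean_minus_i, h, degree)
    sq_err <- sq_err + sum((y[i, ] - mu_hat)^2)
  }
  sq_err / (n * length(x))                       # the 1/(np) normalisation
}
```

Evaluating cv_score_sketch over several values of h gives a rough picture of how the score depends on the bandwidth; the values it returns need not match those of cv.score.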
If the x values are not equally spaced, the data are first smoothed and evaluated on a grid of length gridsize spanning the range of x. The smoothed data are then interpolated back to x.
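The smooth-then-interpolate step can be sketched as below for a single vector of values. The function name smooth_then_interpolate is hypothetical, and the local linear fit with Gaussian weights and the use of linear interpolation via approx are assumptions; the text does not say which interpolation scheme the package uses.

```r
# Hypothetical helper: smooth `ybar` (observed at unequally spaced `x`)
# on an equally spaced grid, then interpolate the result back to `x`.
smooth_then_interpolate <- function(x, ybar, h, gridsize = length(x)) {
  grid <- seq(min(x), max(x), length.out = gridsize)  # equally spaced grid
  fit_grid <- sapply(grid, function(t) {
    w <- dnorm((x - t) / h)                           # Gaussian weights (assumption)
    unname(coef(lm(ybar ~ I(x - t), weights = w))[1]) # local linear fit at t
  })
  approx(grid, fit_grid, xout = x)$y                  # linear interpolation back to x
}

# Usage with unequally spaced observation points
x <- c(0, 0.5, 1.5, 2, 3.5, 4, 5.5, 6, 7.5, 8)
ybar <- 1 + 2 * x
smoothed <- smooth_then_interpolate(x, ybar, h = 1.5, gridsize = 50)
```

A finer grid reduces the interpolation error at the cost of more local fits.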
cv.select uses the standard R function optimize to minimize cv.score. If the argument interval is not specified, the lower bound of the search interval is by default (x_2-x_1)/2 if degree < 2 and x_2-x_1 if degree >= 2. The default value of the upper bound is (\max(x)-\min(x))/2. These values guarantee in most cases that the local polynomial estimator is well defined. It is often useful to plot the function to be optimized for a range of argument values (grid search) before applying a numerical optimizer. In this way, the search interval can be narrowed down and the optimizer is more likely to find a global solution.
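The default interval and the grid-search advice can be sketched as follows. The names default_interval, grid_then_optimize and toy_score are hypothetical, and toy_score is an artificial objective with two local minima that merely stands in for a cross-validation score; only optimize is a real base R function here.

```r
# Hypothetical helper: default search interval as described above.
default_interval <- function(x, degree = 1) {
  x <- sort(x)
  lower <- if (degree < 2) (x[2] - x[1]) / 2 else x[2] - x[1]
  upper <- (max(x) - min(x)) / 2
  c(lower, upper)
}

# Hypothetical helper: coarse grid search, then optimize() on the
# sub-interval bracketing the best grid point.
grid_then_optimize <- function(f, interval, n_grid = 25) {
  grid <- seq(interval[1], interval[2], length.out = n_grid)
  scores <- vapply(grid, f, numeric(1))
  k <- which.min(scores)
  narrowed <- c(grid[max(k - 1, 1)], grid[min(k + 1, n_grid)])
  optimize(f, interval = narrowed)
}

# Artificial objective with local minima near h = 1 and h = 4
toy_score <- function(h) (h - 1)^2 * (h - 4)^2 + 0.1 * h

interval <- default_interval(8:21, degree = 1)
fit <- grid_then_optimize(toy_score, interval)
fit$minimum    # minimizer found in the narrowed interval
fit$objective  # objective value at that minimizer
```

With the package loaded, f would be a wrapper around cv.score for fixed data and degree; check ?cv.score for its argument order before writing that wrapper. Plotting scores against grid inside the helper is an easy way to see where the local minima lie.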
Value
a bandwidth that minimizes the cross-validation score over the search interval. As the examples show, this may be a local rather than a global minimizer.
References
Rice, J. A. and Silverman, B. W. (1991). Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society. Series B (Methodological), 53, 233–243.
See Also
Examples
## Not run:
## Plasma citrate data
## Compare cross-validation scores and bandwidths
## for local linear and local quadratic smoothing
data(plasma)
time <- 8:21
## Local linear smoothing
cv.select(time, plasma, 1) # local solution h = 3.76, S(h) = 463.08
cv.select(time, plasma, 1, interval = c(.5, 1)) # global solution h = .75, S(h) = 439.54
## Local quadratic smoothing
cv.select(time, plasma, 2) # global solution h = 1.15, S(h) = 432.75
cv.select(time, plasma, 2, interval = c(1, 1.5)) # same
## End(Not run)