cv.score {SCBmeanfd} | R Documentation |
Leave-One-Curve-out Cross-Validation Score
Description
Compute the cross-validation score of Rice and Silverman (1991) for the local polynomial estimation of a mean function.
Usage
cv.score(bandwidth, x, y, degree = 1, gridsize = length(x))
Arguments
bandwidth | kernel bandwidth. |
x | observation points. Missing values are not accepted. |
y | matrix or data frame with functional observations (= curves) stored in rows. The number of columns of y must match the length of x. |
degree | degree of the local polynomial fit. |
gridsize | size of evaluation grid for the smoothed data. |
Details
The cross-validation score is obtained by leaving each curve out in turn and computing the prediction error of the local polynomial smoother based on all other curves. For a bandwidth value h, this score is
CV(h) = \frac{1}{np} \sum_{i=1}^n \sum_{j=1}^p \left( Y_{ij} - \hat{\mu}^{-(i)}(x_j;h) \right)^2,
where Y_{ij} is the measurement of the i-th curve at location x_j for i=1,\ldots,n and j=1,\ldots,p, and \hat{\mu}^{-(i)}(x_j;h) is the local polynomial estimator with bandwidth h based on all curves except the i-th.
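The score defined above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helpers local_linear and loco_cv are hypothetical names, the smoother is a plain local linear fit with a Gaussian kernel (an assumption), and the result may differ numerically from cv.score. Because all curves share the same observation points, the fit to the remaining n - 1 curves equals the fit to their pointwise mean, which is what the sketch smooths.

```r
# Hypothetical helper: local linear smoother evaluated at the points in x.
# Assumes a Gaussian kernel; not the smoother used internally by cv.score.
local_linear <- function(x, ybar, h) {
  vapply(x, function(x0) {
    w <- dnorm((x - x0) / h)                      # kernel weights around x0
    fit <- lm.wfit(cbind(1, x - x0), ybar, w)     # weighted least squares
    unname(fit$coefficients[1])                   # intercept = fitted value at x0
  }, numeric(1))
}

# Hypothetical helper: leave-one-curve-out score CV(h) for curves in rows of y.
loco_cv <- function(h, x, y) {
  y <- as.matrix(y)
  n <- nrow(y)
  total <- colSums(y)
  errs <- vapply(seq_len(n), function(i) {
    ybar_minus_i <- (total - y[i, ]) / (n - 1)    # mean of all curves except i
    mu_hat <- local_linear(x, ybar_minus_i, h)    # estimate without curve i
    mean((y[i, ] - mu_hat)^2)                     # average over j = 1, ..., p
  }, numeric(1))
  mean(errs)                                      # average over i = 1, ..., n
}

# Usage on simulated curves (illustrative data only)
set.seed(1)
x <- seq(0, 1, length.out = 50)
mu <- x + 0.2 * sin(2 * pi * x)
y <- matrix(mu + rnorm(10 * 50, sd = 0.25), 10, 50, byrow = TRUE)
scores <- sapply(c(0.02, 0.05, 0.1), loco_cv, x = x, y = y)
```

Averaging the per-curve mean squared errors reproduces the 1/(np) factor in the formula, since every curve has the same number p of observations.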
If the x values are not equally spaced, the data are first smoothed and evaluated on a grid of length gridsize spanning the range of x. The smoothed data are then interpolated back to x.
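The grid-then-interpolate step for unequally spaced points might look like the following sketch. The interpolation method is not stated here, so linear interpolation with approx() is an assumption, and smoothed_grid is a stand-in for the output of a real smoother.

```r
# Unequally spaced observation points (illustrative values)
x <- c(0, 0.05, 0.2, 0.25, 0.6, 0.9, 1)
gridsize <- 101

# Equally spaced evaluation grid spanning the range of x
grid <- seq(min(x), max(x), length.out = gridsize)

# Stand-in for smoothed values on the grid; a real smoother would produce these
smoothed_grid <- sin(2 * pi * grid)

# Linear interpolation of the gridded values back to the original points
# (assumed method; the package may interpolate differently)
smoothed_x <- approx(grid, smoothed_grid, xout = x)$y
```

A finer grid (larger gridsize) reduces the interpolation error at the cost of more smoothing evaluations.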
Value
the cross-validation score.
References
Rice, J. A. and Silverman, B. W. (1991). Estimating the mean and covariance structure nonparametrically when the data are curves. Journal of the Royal Statistical Society. Series B (Methodological), 53, 233–243.
Examples
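## Artificial example
library(SCBmeanfd)
set.seed(1)  # make the simulated noise reproducible
x <- seq(0, 1, length.out = 100)
mu <- x + .2 * sin(2 * pi * x)  # true mean function
## 20 noisy curves stored in rows, each observed at the 100 points in x
y <- matrix(mu + rnorm(2000, sd = .25), 20, 100, byrow = TRUE)
h <- c(.005, .01, .02, .05, .1, .15)  # candidate bandwidths
## Local linear (degree = 1) cross-validation score for each bandwidth
cv <- sapply(h, cv.score, x = x, y = y, degree = 1)
plot(h, cv, type = "l")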
## Plasma citrate data
## Compare cross-validation scores and bandwidths
## for local linear and local quadratic smoothing
## Not run:
data(plasma)
time <- 8:21
h1 <- seq(.5, 1.3, .05)
h2 <- seq(.75, 2, .05)
cv1 <- sapply(h1, cv.score, x = time, y = plasma, degree = 1)
cv2 <- sapply(h2, cv.score, x = time, y = plasma, degree = 2)
plot(h1, cv1, type = "l", xlim = range(c(h1, h2)), ylim = range(c(cv1, cv2)),
     xlab = "Bandwidth (hour)", ylab = "CV score",
     main = "Cross validation for local polynomial estimation")
lines(h2, cv2, col = 2)
legend("topleft", legend = c("Linear", "Quadratic"), lty = 1,
       col = 1:2, cex = .9)
## Note: using local linear (resp. quadratic) smoothing
## with a bandwidth smaller than .5 (resp. .75) can result
## in non-definiteness or numerical instability of the estimator.
## End(Not run)