glsidwcv {spm2} | R Documentation |
Cross validation, n-fold and leave-one-out for the hybrid method of generalized least squares ('gls') and inverse distance weighted ('idw') (glsidw)
Description
This function is a cross validation function for the hybrid method of 'gls' and 'idw', where the data splitting is based on a stratified random sampling method (see the 'datasplit' function for details)
Usage
glsidwcv(
model = var1 ~ 1,
longlat,
trainxy,
y,
corr.args = NULL,
weights = NULL,
idp = 2,
nmaxidw = 12,
validation = "CV",
cv.fold = 10,
predacc = "VEcv",
...
)
Arguments
model |
a formula defining the response variable and predictive variables. |
longlat |
a dataframe contains longitude and latitude of point samples. |
trainxy |
a dataframe contains longitude (long), latitude (lat), predictive variables and the response variable of point samples. That is, the location information must be names as 'long' and 'lat'. |
y |
a vector of the response variable in the formula, that is, the left part of the formula. |
corr.args |
arguments for 'correlation' in 'gls'. See '?corClasses' in 'nlme' for details. By default, "NULL" is used. When "NULL" is used, then 'gls' is actually performing 'lm'. |
weights |
describing the within-group heteroscedasticity structure. Defaults to "NULL", corresponding to homoscedastic errors. See '?gls' in 'nlme' for details. |
idp |
a numeric number specifying the inverse distance weighting power. |
nmaxidw |
for a local predicting: the number of nearest observations that should be used for a prediction or simulation, where nearest is defined in terms of the space of the spatial locations. By default, 12 observations are used. |
validation |
validation methods, include 'LOO': leave-one-out, and 'CV': cross-validation. |
cv.fold |
integer; number of folds in the cross-validation. if > 1, then apply n-fold cross validation; the default is 10, i.e., 10-fold cross validation that is recommended. |
predacc |
can be either "VEcv" for vecv or "ALL" for all measures in function pred.acc. |
... |
other arguments passed on to 'gls' and 'gstat'. |
Value
A list with the following components: me, rme, mae, rmae, mse, rmse, rrmse, vecv and e1; or vecv only.
Note
This function is largely based on rfcv in 'randomForest' and 'gls' in 'library(nlme)'.
Author(s)
Jin Li
References
Pinheiro, J. C. and D. M. Bates (2000). Mixed-Effects Models in S and S-PLUS. New York, Springer.
Pebesma, E.J., 2004. Multivariable geostatistics in S: the gstat package. Computers & Geosciences, 30: 683-691.
Examples
library(spm)
library(nlme)
data(petrel)
gravel <- petrel[, c(1, 2, 6:9, 5)]
longlat <- petrel[, c(1, 2)]
range1 <- 0.8
nugget1 <- 0.5
model <- log(gravel + 1) ~ long + lat + bathy + dist + I(long^2) + I(lat^2) +
I(lat^3) + I(bathy^2) + I(bathy^3) + I(dist^2) + I(dist^3) + I(relief^2) + I(relief^3)
glsidwcv1 <- glsidwcv(model = model, longlat = longlat, trainxy = gravel,
y = log(gravel[, 7] +1), idp = 2, nmaxidw = 12, validation = "CV",
corr.args = corSpher(c(range1, nugget1), form = ~ lat + long, nugget = TRUE),
predacc = "ALL")
glsidwcv1
# For glsidw
set.seed(1234)
n <- 20 # number of iterations,60 to 100 is recommended.
VEcv <- NULL
for (i in 1:n) {
glsidwcv1 <- glsidwcv(model = model, longlat = longlat, trainxy = gravel,
y = log(gravel[, 7] +1), idp = 2, nmaxidw = 12, validation = "CV",
corr.args = corSpher(c(range1, nugget1), form = ~ lat + long, nugget = TRUE),
predacc = "VEcv")
VEcv [i] <- glsidwcv1
}
plot(VEcv ~ c(1:n), xlab = "Iteration for GLSIDW", ylab = "VEcv (%)")
points(cumsum(VEcv) / c(1:n) ~ c(1:n), col = 2)
abline(h = mean(VEcv), col = 'blue', lwd = 2)