glskrigecv {spm2}R Documentation

Cross validation, n-fold and leave-one-out for the hybrid method of generalized least squares ('gls') and kriging ('krige') ('glskrige')

Description

This function is a cross validation function for the hybrid method of 'gls' and 'krige' ('glskrige'), where the data splitting is based on a stratified random sampling method (see the 'datasplit' function for details)

Usage

glskrigecv(
  model = var1 ~ 1,
  longlat,
  trainxy,
  y,
  corr.args = NULL,
  weights = NULL,
  transformation = "none",
  delta = 1,
  formula.krige = res1 ~ 1,
  vgm.args = c("Sph"),
  anis = c(0, 1),
  alpha = 0,
  block = 0,
  beta,
  nmaxkrige = 12,
  validation = "CV",
  cv.fold = 10,
  predacc = "VEcv",
  ...
)

Arguments

model

a formula defining the response variable and predictive variables.

longlat

a dataframe contains longitude and latitude of point samples.

trainxy

a dataframe contains longitude (long), latitude (lat), predictive variables and the response variable of point samples. That is, the location information must be names as 'long' and 'lat'.

y

a vector of the response variable in the formula, that is, the left part of the formula.

corr.args

arguments for 'correlation' in 'gls'. See '?corClasses' in 'nlme' for details. By default, "NULL" is used. When "NULL" is used, then 'gls' is actually performing 'lm'.

weights

describing the within-group heteroscedasticity structure. Defaults to "NULL", corresponding to homoscedastic errors. See '?gls' in 'nlme' for details.

transformation

transform the residuals of 'gls' to normalize the data; can be "sqrt" for square root, "arcsine" for arcsine, "log" or "none" for non transformation. By default, "none" is used.

delta

numeric; to avoid log(0) in the log transformation. The default is 1.

formula.krige

formula defining the response vector and (possible) regressor. an object (i.e., 'variogram.formula') for 'variogram' or a formula for 'krige'. see 'variogram' and 'krige' in 'gstat' for details.

vgm.args

arguments for 'vgm', e.g. variogram model of response variable and anisotropy parameters. see 'vgm' in 'gstat' for details. By default, "Sph" is used.

anis

anisotropy parameters: see notes 'vgm' in 'gstat' for details.

alpha

direction in plane (x,y). see variogram in 'gstat' for details.

block

block size. see 'krige' in 'gstat' for details.

beta

for simple kriging. see 'krige' in 'gstat' for details.

nmaxkrige

for a local predicting: the number of nearest observations that should be used for a prediction or simulation, where nearest is defined in terms of the space of the spatial locations. By default, 12 observations are used.

validation

validation methods, include 'LOO': leave-one-out, and 'CV': cross-validation.

cv.fold

integer; number of folds in the cross-validation. if > 1, then apply n-fold cross validation; the default is 10, i.e., 10-fold cross validation that is recommended.

predacc

can be either "VEcv" for vecv or "ALL" for all measures in function pred.acc.

...

other arguments passed on to 'gls' and 'krige'.

Value

A list with the following components: me, rme, mae, rmae, mse, rmse, rrmse, vecv and e1; or vecv only.

Note

This function is largely based on rfcv in 'randomForest', 'krigecv' in 'spm2' and 'gls' in 'library(nlme)'.

Author(s)

Jin Li

References

Pinheiro, J. C. and D. M. Bates (2000). Mixed-Effects Models in S and S-PLUS. New York, Springer.

Pebesma, E.J., 2004. Multivariable geostatistics in S: the gstat package. Computers & Geosciences, 30: 683-691.

Examples


library(spm)
library(nlme)

data(petrel)
gravel <- petrel[, c(1, 2, 6:9, 5)]
longlat <- petrel[, c(1, 2)]
range1 <- 0.8
nugget1 <- 0.5
model <- log(gravel + 1) ~  long + lat +  bathy + dist + I(long^2) + I(lat^2) +
I(lat^3) + I(bathy^2) + I(bathy^3) + I(dist^2) + I(dist^3) + I(relief^2) + I(relief^3)

glskrigecv1 <- glskrigecv(model = model, longlat = longlat, trainxy = gravel,
y = log(gravel[, 7] +1), transformation = "none", formula.krige = res1 ~ 1,
vgm.args = "Sph", nmaxkrige = 12, validation = "CV",
 corr.args = corSpher(c(range1, nugget1), form = ~ lat + long, nugget = TRUE),
 predacc = "ALL")
glskrigecv1

# For glskrige
set.seed(1234)
n <- 20 # number of iterations,60 to 100 is recommended.
VEcv <- NULL
for (i in 1:n) {
glskrigecv1 <- glskrigecv(model = model, longlat = longlat, trainxy = gravel,
y = log(gravel[, 7] +1), transformation = "none", formula.krige = res1 ~ 1,
vgm.args = "Sph", nmaxok = 12, validation = "CV",
corr.args = corSpher(c(range1, nugget1), form = ~ lat + long, nugget = TRUE),
 predacc = "VEcv")
VEcv [i] <- glskrigecv1
}
plot(VEcv ~ c(1:n), xlab = "Iteration for GLSOK", ylab = "VEcv (%)")
points(cumsum(VEcv) / c(1:n) ~ c(1:n), col = 2)
abline(h = mean(VEcv), col = 'blue', lwd = 2)



[Package spm2 version 1.1.3 Index]