svmcv {spm2}

R Documentation

Cross validation, n-fold and leave-one-out for support vector machine ('svm')

Description

This function performs n-fold or leave-one-out cross-validation for 'svm' regression models fitted with the 'e1071' package.

Usage

svmcv(
  formula = NULL,
  trainxy,
  y,
  scale = TRUE,
  type = NULL,
  kernel = "radial",
  degree = 3,
  gamma = if (is.vector(trainxy)) 1 else 1/ncol(trainxy),
  coef0 = 0,
  cost = 1,
  nu = 0.5,
  tolerance = 0.001,
  epsilon = 0.1,
  validation = "CV",
  cv.fold = 10,
  predacc = "VEcv",
  ...
)

Arguments

`formula`	a formula defining the response variable and predictive variables.
`trainxy`	a data frame containing the predictive variables and the response variable of point samples. For spatial predictive modelling, the location information, longitude and latitude, needs to be included in 'trainxy' and named 'long' and 'lat'.
`y`	a vector of the response variable in the formula, that is, the left part of the formula.
`scale`	a logical vector indicating the variables to be scaled (default: TRUE).
`type`	the default setting is 'NULL'. See '?svm' for various options.
`kernel`	the default setting is 'radial'. See '?svm' for other options.
`degree`	a parameter needed for kernel of type polynomial (default: 3).
`gamma`	a parameter needed for all 'kernels' except 'linear' (default: 1/(data dimension)).
`coef0`	a parameter needed for kernels of type 'polynomial' and 'sigmoid' (default: 0).
`cost`	cost of constraints violation (default: 1).
`nu`	a parameter needed for 'nu-classification', 'nu-regression', and 'one-classification' (default: 0.5).
`tolerance`	tolerance of termination criterion (default: 0.001).
`epsilon`	'epsilon' in the insensitive-loss function (default: 0.1).
`validation`	validation method: 'LOO' for leave-one-out or 'CV' for n-fold cross-validation (default: 'CV').
`cv.fold`	integer; number of folds in the cross-validation. If > 1, n-fold cross-validation is applied; the default is 10, i.e., the recommended 10-fold cross-validation.
`predacc`	either "VEcv" to return 'vecv' only, or "ALL" to return all measures computed by the function 'pred.acc' (default: "VEcv").
`...`	other arguments passed on to 'svm'.
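
As an illustration of what 'validation = "CV"' with 'cv.fold = 10' means, the following is a minimal sketch of n-fold cross-validation around 'svm' from 'e1071'. It is not the internal code of 'svmcv'; the built-in 'mtcars' data, the formula and the parameter values are assumptions chosen only to make the sketch runnable.

```r
library(e1071)  # provides svm()

set.seed(1234)
dat <- mtcars   # built-in data, used only for illustration (assumption)
k <- 10         # number of folds, mirroring cv.fold = 10
n <- nrow(dat)

# Randomly assign each row to one of k folds of near-equal size
fold <- sample(rep(seq_len(k), length.out = n))

# Each observation is predicted by a model that never saw it
cv.pred <- numeric(n)
for (i in seq_len(k)) {
  test <- fold == i
  fit <- svm(mpg ~ wt + hp, data = dat[!test, ], kernel = "radial",
             gamma = 0.5, cost = 1, epsilon = 0.1)
  cv.pred[test] <- predict(fit, newdata = dat[test, ])
}

# Leave-one-out is the special case k = n (one observation per fold)
head(cbind(observed = dat$mpg, predicted = cv.pred))
```

The held-out predictions in 'cv.pred' are what accuracy measures such as 'vecv' would then be computed from.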

Value

A list with the following components: me, rme, mae, rmae, mse, rmse, rrmse, vecv and e1 (for predacc = "ALL"); or vecv only (for predacc = "VEcv").
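
To make the 'vecv' component concrete, the sketch below assumes the usual definition of VEcv, the variance explained by cross-validated predictions, expressed in percent. The helper 'vecv_sketch' is hypothetical and not part of 'spm2'; the package's own 'vecv' and 'pred.acc' functions should be preferred in practice.

```r
# Hypothetical helper (not part of spm2): variance explained by
# cross-validated predictions, in percent, under the assumed definition
# VEcv = (1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)) * 100
vecv_sketch <- function(obs, pred) {
  (1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)) * 100
}

obs <- c(1, 2, 3, 4, 5)
vecv_sketch(obs, obs)                          # perfect predictions: 100
vecv_sketch(obs, rep(mean(obs), length(obs)))  # predicting the mean: 0
```

Under this definition, negative values are possible when the predictions are worse than simply using the mean of the observations.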

Note

This function is largely based on 'rfcv' in 'randomForest' and 'svm' in 'e1071'.

Author(s)

Jin Li

References

David Meyer, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel and Friedrich Leisch (2020). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. R package version 1.7-4. https://CRAN.R-project.org/package=e1071.

Examples

library(spm)   # provides the petrel data
library(spm2)  # provides svmcv()

data(petrel)
gravel <- petrel[, c(1, 2, 6:9, 5)]
model <- log(gravel + 1) ~ lat +  bathy + I(long^3) + I(lat^2) + I(lat^3)
set.seed(1234)
svmcv1 <- svmcv(formula = model, gravel, log(gravel[, 7] + 1), validation = "CV",
  predacc = "ALL")
svmcv1

data(sponge2)
model <- species.richness ~ .
set.seed(1234)
svmcv1 <- svmcv(formula = model, sponge2[, -4], sponge[, 3], gamma = 0.01, cost = 3.5,
  scale = TRUE, validation = "CV", predacc = "VEcv")
svmcv1

# For svm: repeat the cross-validation to assess the stability of VEcv
model <- species.richness ~ .
set.seed(1234)
n <- 20 # number of iterations; 60 to 100 is recommended.
VEcv <- NULL
for (i in 1:n) {
  svmcv1 <- svmcv(formula = model, sponge2[, -4], sponge[, 3], gamma = 0.01, cost = 3.5,
    scale = TRUE, validation = "CV", predacc = "VEcv")
  VEcv[i] <- svmcv1
}
plot(VEcv ~ c(1:n), xlab = "Iteration for svm", ylab = "VEcv (%)")
points(cumsum(VEcv) / c(1:n) ~ c(1:n), col = 2)
abline(h = mean(VEcv), col = 'blue', lwd = 2)

[Package spm2 version 1.1.3 Index]