crossValidation {DiceEval} | R Documentation |
K-fold Cross Validation
Description
This function computes the predicted values at each point of the design and gives an estimate of the quality criterion (Q2) using K-fold cross-validation.
Usage
crossValidation(model, K)
Arguments
model |
an output of the |
K |
the number of groups into which the data should be split to apply cross-validation |
Value
A list with the following components:
Ypred |
a vector of predicted values obtained using K-fold cross-validation at the points of the design |
Q2 |
a real number, the cross-validation estimate of the Q2 criterion |
folds |
a list which indicates the partitioning of the data into the folds |
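RMSE_CV |
the RMSE criterion estimated by K-fold cross-validation |
MAE_CV |
the MAE criterion estimated by K-fold cross-validation |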
In the case of a Kriging model, additional components are returned to assess the robustness of the procedure:
theta |
the range parameter theta estimated for each fold, |
trend |
the trend parameter estimated for each fold, |
shape |
the estimated shape parameter if the covariance structure is of type |
The principle of cross-validation is to split the data into K folds of approximately equal size, A_{1}, ..., A_{K}. For k = 1 to K, a model \hat{Y}^{(-k)} is fitted from the data \cup_{j \neq k} A_{j} (all folds except A_{k}) and this model is validated on the fold A_{k}. Given a quality criterion L (here, L can be the RMSE or the MAE criterion), the "evaluation" of the model on fold k consists in computing:
L_{k} = \frac{1}{n/K} \sum_{i \in A_{k}} L \left( y_{i}, \hat{Y}^{(-k)}(x_{i}) \right),
where n is the number of observations, so that n/K is the size of a fold.
The cross-validation criterion is the mean of the K fold criteria:
L_{CV} = \frac{1}{K} \sum_{k=1}^{K} L_{k}.
The Q2 criterion is defined as:
Q2 = R2(Y, Ypred),
with Y the response values and Ypred the values predicted by cross-validation.
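The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: kfold_cv is a hypothetical helper, the model is a plain lm() fit, folds are drawn at random, R2 is assumed to be the usual 1 - SSE/SST, and the RMSE and MAE are averaged per fold as in the formulas above (the package may aggregate them differently).

```r
# Minimal base-R sketch of K-fold cross-validation for a linear model.
# kfold_cv is a hypothetical helper, not part of DiceEval.
kfold_cv <- function(data, formula, K) {
  n <- nrow(data)
  # Randomly assign each observation to one of K folds of near-equal size
  fold_id <- sample(rep(seq_len(K), length.out = n))
  folds <- split(seq_len(n), fold_id)
  y <- data[[all.vars(formula)[1]]]
  ypred <- numeric(n)
  rmse_k <- numeric(K)
  mae_k <- numeric(K)
  for (k in seq_len(K)) {
    test <- folds[[k]]
    # Fit on all folds except A_k, then predict on A_k
    fit <- lm(formula, data = data[-test, , drop = FALSE])
    ypred[test] <- predict(fit, newdata = data[test, , drop = FALSE])
    err <- y[test] - ypred[test]
    rmse_k[k] <- sqrt(mean(err^2))
    mae_k[k] <- mean(abs(err))
  }
  # Q2: assumed here to be 1 - SSE/SST on the cross-validated predictions
  q2 <- 1 - sum((y - ypred)^2) / sum((y - mean(y))^2)
  list(Ypred = ypred, Q2 = q2, folds = folds,
       RMSE_CV = mean(rmse_k), MAE_CV = mean(mae_k))
}

# Usage on synthetic data (illustrative only)
set.seed(1)
d <- data.frame(X1 = runif(50), X2 = runif(50))
d$Y <- 1 + 2 * d$X1 - 3 * d$X2 + rnorm(50, sd = 0.1)
res <- kfold_cv(d, Y ~ X1 + X2, K = 10)
res$Q2
```

Each observation is predicted exactly once, by the model that did not see it during fitting, which is why Ypred has one value per design point.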
Note
When K is equal to the number of observations, leave-one-out cross-validation is performed.
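The leave-one-out case can be illustrated with a small base-R sketch. The data, the lm() model, and the random fold assignment are assumptions made for illustration; the package may build its folds differently.

```r
# Leave-one-out as the special case K = n, sketched in base R with lm().
set.seed(2)
n <- 20
d <- data.frame(X1 = runif(n))
d$Y <- 2 * d$X1 + rnorm(n, sd = 0.1)

K <- n
# With K = n, each fold receives exactly one observation
folds <- split(seq_len(n), sample(rep(seq_len(K), length.out = n)))
fold_sizes <- lengths(folds)

loo_pred <- numeric(n)
for (k in seq_len(K)) {
  i <- folds[[k]]
  # Fit on the n - 1 remaining points, predict the held-out one
  fit <- lm(Y ~ X1, data = d[-i, , drop = FALSE])
  loo_pred[i] <- predict(fit, newdata = d[i, , drop = FALSE])
}
```

With K = n the model is refitted n times, so the cost grows with the number of observations.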
Author(s)
D. Dupuy
See Also
R2, modelFit, MAE, RMSE, foldsComposition, testCrossValidation
Examples
## Not run:
rm(list=ls())
# A 2D example
Branin <- function(x1,x2) {
  # Rescale the inputs from [0,1] to the usual Branin domain
  x1 <- x1*15-5
  x2 <- x2*15
  (x2 - 5/(4*pi^2)*(x1^2) + 5/pi*x1 - 6)^2 + 10*(1 - 1/(8*pi))*cos(x1) + 10
}
# Linear model on 50 points
n <- 50
X <- matrix(runif(n*2),ncol=2,nrow=n)
Y <- Branin(X[,1],X[,2])
modLm <- modelFit(X,Y,type = "Linear",formula=Y~X1+X2+X1:X2+I(X1^2)+I(X2^2))
R2(Y,modLm$model$fitted.values)
crossValidation(modLm,K=10)$Q2
# Kriging model: powered-exponential covariance structure ("powexp"),
# constant trend, no nugget effect, on 16 points
n <- 16
X <- data.frame(x1=runif(n),x2=runif(n))
Y <- Branin(X[,1],X[,2])
mKm <- modelFit(X,Y,type="Kriging",formula=~1, covtype="powexp")
K <- 10
out <- crossValidation(mKm, K)
par(mfrow=c(2,2))
# Index 0 is the model fitted on all the data; 1..K are the per-fold estimates
plot(c(0,1:K),c(mKm$model@covariance@range.val[1],out$theta[,1]),
     xlab='',ylab='Theta1')
plot(c(0,1:K),c(mKm$model@covariance@range.val[2],out$theta[,2]),
     xlab='',ylab='Theta2')
plot(c(0,1:K),c(mKm$model@covariance@shape.val[1],out$shape[,1]),
     xlab='',ylab='p1',ylim=c(0,2))
plot(c(0,1:K),c(mKm$model@covariance@shape.val[2],out$shape[,2]),
     xlab='',ylab='p2',ylim=c(0,2))
par(mfrow=c(1,1))
## End(Not run)