crossval {bootstrap}R Documentation

K-fold Cross-Validation

Description

See Efron and Tibshirani (1993) for details on this function.

Usage

   crossval(x, y, theta.fit, theta.predict, ..., ngroup=n)

Arguments

x

a matrix containing the predictor (regressor) values. Each row corresponds to an observation.

y

a vector containing the response values

theta.fit

function to be cross-validated. Takes x and y as an argument. See example below.

theta.predict

function producing predicted values for theta.fit. Arguments are a matrix x of predictors and fit object produced by theta.fit. See example below.

...

any additional arguments to be passed to theta.fit

ngroup

optional argument specifying the number of groups formed . Default is ngroup=sample size, corresponding to leave-one out cross-validation.

Value

list with the following components

cv.fit

The cross-validated fit for each observation. The numbers 1 to n (the sample size) are partitioned into ngroup mutually disjoint groups of size "leave.out". leave.out, the number of observations in each group, is the integer part of n/ngroup. The groups are chosen at random if ngroup < n. (If n/leave.out is not an integer, the last group will contain > leave.out observations). Then theta.fit is applied with the kth group of observations deleted, for k=1, 2, ngroup. Finally, the fitted value is computed for the kth group using theta.predict.

ngroup

The number of groups

leave.out

The number of observations in each group

groups

A list of length ngroup containing the indices of the observations in each group. Only returned if leave.out > 1.

call

The deparsed call

References

Stone, M. (1974). Cross-validation choice and assessment of statistical predictions. Journal of the Royal Statistical Society, B-36, 111–147.

Efron, B. and Tibshirani, R. (1993) An Introduction to the Bootstrap. Chapman and Hall, New York, London.

Examples

# cross-validation of least squares regression
# note that crossval is not very efficient, and being a
#  general purpose function, it does not use the
# Sherman-Morrison identity for this special case
   x <- rnorm(85)  
   y <- 2*x +.5*rnorm(85)                      
   theta.fit <- function(x,y){lsfit(x,y)}
   theta.predict <- function(fit,x){
               cbind(1,x)%*%fit$coef         
               }                       
   results <- crossval(x,y,theta.fit,theta.predict,ngroup=6)  
                                      

[Package bootstrap version 2019.6 Index]