rda.cv {rda} | R Documentation |
RDA Cross Validation Function
Description
Performs RDA cross-validation on the training data set.
Usage
rda.cv(fit, x, y, prior, alpha, delta, nfold=min(table(y), 10),
folds=balanced.folds(y), trace=FALSE)
Arguments
fit |
An object returned by rda. |
x |
The training data set as used in the rda fit. |
y |
The class labels of the training samples (columns) in "x" as
used in the rda fit. |
prior |
A numerical vector that gives the prior proportion of each
class. Its length should equal the number of classes. By default,
the function uses the one stored in the fit object. |
alpha |
A numerical vector of the regularization values for alpha.
By default, the function uses the one stored in the fit object. |
delta |
A numerical vector of the threshold values for delta.
By default, the function uses the one stored in the fit object. |
nfold |
An integer specifying the number of folds in
the cross-validation analysis. This option is overridden when the
folds argument is supplied. |
folds |
A list that provides the folds used in the cross-validation analysis. Each component of the list is an integer vector of the sample indices. See examples below for more details. |
trace |
A logical flag indicating whether the intermediate steps should be printed. |
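The folds argument expects a list whose components are integer vectors of sample indices. As a minimal sketch of that structure, the base-R helper below builds class-stratified folds. The name make_folds and the toy labels are hypothetical and introduced only for illustration; this is not the package's balanced.folds, whose internals are not described here.

```r
# Hypothetical helper (not part of the rda package): build stratified folds
# as a list of integer vectors of sample indices.
make_folds <- function(y, nfold) {
  # Group the sample indices by class label
  idx_by_class <- split(seq_along(y), y)
  # Shuffle within each class; sample.int avoids the sample() pitfall
  # for classes that contain a single index
  shuffled <- unlist(lapply(idx_by_class,
                            function(i) i[sample.int(length(i))]),
                     use.names = FALSE)
  # Deal the class-ordered indices round-robin across the folds
  fold_id <- rep_len(seq_len(nfold), length(shuffled))
  unname(split(shuffled, fold_id))
}

set.seed(1)
y <- rep(c(1, 2), times = c(22, 40))  # toy labels, assumed for illustration
folds <- make_folds(y, nfold = 6)
```

A list built this way has the shape the folds argument describes; whether stratification is desirable depends on the data.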
Details
rda.cv performs RDA-based cross-validation on the training data set.
Value
The rda.cv function returns an object of class rdacv with the following list of components:
alpha |
The vector of the regularization values for alpha used in the cross-validation. |
delta |
The vector of the threshold values for delta used in the cross-validation. |
prior |
The vector of the prior proportion of each class used in the cross-validation. |
nfold |
The number of folds used in the cross-validation. |
folds |
The folds used in the cross-validation. |
yhat.new |
The 3-dimensional array of the predicted class labels of the training samples for each combination (alpha, delta). The first index corresponds to the alpha values, the second index corresponds to the delta values, and the third index corresponds to the samples; each entry is the predicted class label for that sample. |
err |
The training error matrix from cross-validation. The rows correspond to the alpha values while the columns correspond to the delta values. It is automatically generated by the function. |
cv.err |
The test error (or cross-validation error) matrix. The rows correspond to the alpha values while the columns correspond to the delta values. |
ngene |
The matrix of the number of shrunken genes. The rows correspond to the alpha values while the columns correspond to the delta values. Note: the number of shrunken genes is based on the average result from cross-validation. |
reg |
The type of regularization used in cross-validation. |
n |
The sample size of the training data set. |
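Because cv.err and ngene are both matrices indexed by (alpha, delta), one common way to use the result is to locate the cell with the smallest cross-validation error. The sketch below works on toy stand-ins rather than a real rdacv object; all numbers are invented for illustration, and the tie-breaking rule (prefer the smaller ngene entry) is an assumption, not something the package prescribes.

```r
# Toy stand-ins for fit.cv$alpha, fit.cv$delta, fit.cv$cv.err and fit.cv$ngene;
# the values are invented for illustration only.
alpha  <- c(0, 0.5, 0.99)
delta  <- c(0, 1, 2)
cv.err <- matrix(c( 9,  7,  8,
                    8,  7, 10,
                   12, 11, 13), nrow = 3, byrow = TRUE)
ngene  <- matrix(c(2000, 300, 40,
                   2000, 150, 20,
                   2000,  90, 10), nrow = 3, byrow = TRUE)

# All (row, col) positions that attain the minimum cross-validation error
best <- which(cv.err == min(cv.err), arr.ind = TRUE)

# Assumed tie-break: among the minimizers, take the smallest ngene entry
pick <- best[which.min(ngene[best]), , drop = FALSE]

# Rows index alpha and columns index delta, as described above
best.alpha <- alpha[pick[1, "row"]]
best.delta <- delta[pick[1, "col"]]
```

With a real result, the same indexing would apply to the corresponding components of the returned object.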
Author(s)
Yaqian Guo, Trevor Hastie and Robert Tibshirani
References
Y. Guo, T. Hastie, R. Tibshirani, (2006). Regularized linear discriminant analysis and its application in microarrays, Biostatistics 8 pp. 86–100. doi:10.1093/biostatistics/kxj035.
See Also
See also rda and predict.rda.
Examples
## These examples will take too long and R CMD check will complain ...
data(colon)
colon.x <- t(colon.x)
fit <- rda(colon.x, colon.y)
fit.cv <- rda.cv(fit, x=colon.x, y=colon.y)
## To use customized folds in cross-validation,
## for example, 6 folds with 11, 11, 10, 10, 10, 10 samples
## in the respective folds, you can do the following:
index <- sample(1:62, 62)
folds <- list()
folds[[1]] <- index[1:11]
folds[[2]] <- index[12:22]
folds[[3]] <- index[23:32]
folds[[4]] <- index[33:42]
folds[[5]] <- index[43:52]
folds[[6]] <- index[53:62]
fit.cv <- rda.cv(fit, colon.x, colon.y, folds=folds)