mgsim.cv {plsgenomics} | R Documentation |
Determination of the ridge regularization parameter and the bandwidth to be used for classification with GSIM for categorical data
Description
The function mgsim.cv determines the best ridge regularization parameter and bandwidth to
be used for classification with MGSIM, as described in Lambert-Lacroix and Peyre (2006).
Usage
mgsim.cv(Ytrain,Xtrain,LambdaRange,hRange,NbIterMax=50)
Arguments
Xtrain |
a (ntrain x p) data matrix of predictors. |
Ytrain |
a response vector of length ntrain. |
LambdaRange |
a vector of positive real values from which the best ridge regularization parameter is chosen by cross-validation. |
hRange |
a vector of strictly positive real values from which the best bandwidth is chosen by cross-validation. |
NbIterMax |
a positive integer giving the maximal number of iterations (default 50). |
Details
The cross-validation procedure described in Lambert-Lacroix and Peyre (2006)
is used to determine the best ridge regularization parameter and bandwidth to be
used for classification with GSIM for categorical data (for binary data, see
gsim and gsim.cv).
At each cross-validation run, Xtrain is split into a pseudo training
set (ntrain-1 samples) and a pseudo test set (1 sample), i.e. leave-one-out
cross-validation, and the classification error rate is determined for each
pair of values of the ridge regularization parameter and bandwidth. Finally,
the function mgsim.cv returns the values of the ridge regularization parameter
and bandwidth for which the mean classification error rate is minimal.
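The leave-one-out grid search described above can be sketched in base R. This is only an illustration of the selection loop, not the package's implementation: the helper names loo_grid_search and toy_predict are hypothetical, and toy_predict is a simple kernel-vote classifier standing in for the MGSIM fit, which is not reproduced here.

```r
# Toy stand-in classifier (NOT the MGSIM estimator): Gaussian-kernel class
# votes with bandwidth h, shrunk toward the class frequencies by lambda.
toy_predict <- function(Xtr, Ytr, xte, lambda, h) {
  classes <- sort(unique(Ytr))
  d2 <- rowSums(sweep(Xtr, 2, xte)^2)        # squared distances to xte
  w <- exp(-d2 / (2 * h^2))                  # kernel weights
  votes <- sapply(classes, function(k) sum(w[Ytr == k]))
  prior <- sapply(classes, function(k) mean(Ytr == k))
  classes[which.max(votes + lambda * prior)]
}

# Hypothetical helper: leave-one-out grid search over (lambda, h) pairs.
loo_grid_search <- function(Ytrain, Xtrain, LambdaRange, hRange,
                            predict_one = toy_predict) {
  ntrain <- nrow(Xtrain)
  err <- matrix(0, length(LambdaRange), length(hRange),
                dimnames = list(as.character(LambdaRange),
                                as.character(hRange)))
  for (i in seq_len(ntrain)) {               # one run per held-out sample
    for (a in seq_along(LambdaRange)) {
      for (b in seq_along(hRange)) {
        pred <- predict_one(Xtrain[-i, , drop = FALSE], Ytrain[-i],
                            Xtrain[i, ], LambdaRange[a], hRange[b])
        err[a, b] <- err[a, b] + (pred != Ytrain[i])
      }
    }
  }
  err <- err / ntrain                        # mean error rate per pair
  best <- which(err == min(err), arr.ind = TRUE)[1, ]  # first minimiser
  list(Lambda = LambdaRange[best[1]], h = hRange[best[2]], ErrorRate = err)
}

# Simulated two-class data, for illustration only
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0), 20, 2),
           matrix(rnorm(40, mean = 3), 20, 2))
Y <- rep(1:2, each = 20)
cv <- loo_grid_search(Y, X, LambdaRange = c(0.1, 1), hRange = c(0.5, 2))
cv$Lambda
cv$h
```

When several pairs reach the same minimal error rate, this sketch keeps the first one found; how mgsim.cv breaks such ties is not stated here.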
Value
A list with the following components:
Lambda |
the optimal regularization parameter. |
h |
the optimal bandwidth parameter. |
Author(s)
Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/) and Julie Peyre (https://membres-ljk.imag.fr/Julie.Peyre/).
References
S. Lambert-Lacroix, J. Peyre (2006). Local likelihood regression in generalized linear single-index models with applications to microarray data. Computational Statistics and Data Analysis, vol. 51, no. 3, 2091-2113.
See Also
mgsim, gsim, gsim.cv
Examples
## Not run:
## between 5~15 seconds
# load plsgenomics library
library(plsgenomics)
# load SRBCT data
data(SRBCT)
IndexLearn <- c(sample(which(SRBCT$Y==1),10),sample(which(SRBCT$Y==2),4),
sample(which(SRBCT$Y==3),7),sample(which(SRBCT$Y==4),9))
### Determine optimum h and lambda
# /!\ takes about 30 seconds to run
hl <- mgsim.cv(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],
               LambdaRange=c(0.1),hRange=c(7,20))
### perform prediction by MGSIM
res <- mgsim(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],Lambda=hl$Lambda,
             h=hl$h,Xtest=SRBCT$X[-IndexLearn,])
res$Cvg
# number of misclassified test samples
sum(res$Ytest!=SRBCT$Y[-IndexLearn])
## End(Not run)