frankvali {mt} | R Documentation |
Estimates Feature Ranking Error Rate with Resampling
Description
Estimates the error rate of feature ranking using resampling methods.
Usage
frankvali(dat, ...)
## Default S3 method:
frankvali(dat, cl, cl.method = "svm", fs.method = "fs.auc",
          fs.order = NULL, fs.len = "power2", pars = valipars(),
          tr.idx = NULL, ...)
## S3 method for class 'formula'
frankvali(formula, data = NULL, ..., subset, na.action = na.omit)
fs.cl(dat, cl, fs.order = colnames(dat), fs.len = 1:ncol(dat),
      cl.method = "svm", pars = valipars(), all.fs = FALSE, ...)
fs.cl.1(dat, cl, fs.order = colnames(dat), cl.method = "svm",
        pars = valipars(), agg_f = FALSE, ...)
Arguments
formula |
A formula of the form |
data |
Data frame from which variables specified in |
dat |
A matrix or data frame containing the explanatory variables if no formula is given as the principal argument. |
cl |
A factor specifying the class for each observation if no formula principal argument is given. |
cl.method |
Classification method to be used. Any classification method can be employed
if it has method |
fs.method |
Feature ranking method to be used. If |
fs.order |
A vector giving the ranked order of the features. In |
fs.len |
Feature length used for validation. For details, see |
pars |
A list specifying the resampling scheme, such as cross-validation,
stratified cross-validation, leave-one-out cross-validation,
randomised validation (holdout), bootstrap, .632 bootstrap
and .632 plus bootstrap, together with control parameters for the calculation of accuracy.
See |
tr.idx |
User defined index of training samples. Can be generated by |
all.fs |
A logical value indicating whether all features should be used for evaluation. |
agg_f |
A logical value indicating whether aggregated features should be used for evaluation. |
... |
Additional parameters to |
subset |
Optional vector, specifying a subset of observations to be used. |
na.action |
Function which indicates what should happen when the data
contains |
Details
These functions validate the selected feature subsets by classification and resampling.
Any classification model can be used if its calling convention is
model(formula, data, subset, ...) and its corresponding method
predict.model(object, newdata, ...) returns either the predicted class labels
alone or a list with a component named class, as lda and pcalda do.
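As an illustration of this calling convention, the following is a minimal sketch of a nearest-centroid classifier written in base R. The names centroid_model and predict.centroid_model are hypothetical and not part of mt. The sketch only shows a model function that accepts formula, data and subset, and a predict method that returns a list with a class component; whether it plugs into frankvali unchanged has not been verified here.

```r
# Hypothetical classifier (not part of mt): nearest class centroid.
centroid_model <- function(formula, data, subset, ...) {
  if (!missing(subset)) data <- data[subset, , drop = FALSE]
  mf <- model.frame(formula, data)
  y <- factor(model.response(mf))
  x <- mf[, -1, drop = FALSE]
  # one row of feature means per class; row names are the class levels
  centers <- do.call(rbind, lapply(split(x, y), colMeans))
  structure(list(centers = centers), class = "centroid_model")
}

# Matching predict method: returns a list with a component named class.
predict.centroid_model <- function(object, newdata, ...) {
  x <- as.matrix(newdata[, colnames(object$centers), drop = FALSE])
  # squared Euclidean distance from every sample to every class centroid
  d <- sapply(seq_len(nrow(object$centers)), function(k)
    rowSums(sweep(x, 2, object$centers[k, ])^2))
  d <- matrix(d, nrow = nrow(x))
  lev <- rownames(object$centers)
  list(class = factor(lev[max.col(-d, ties.method = "first")], levels = lev))
}

# Usage on a built-in data set: train on odd rows, test on even rows.
tr <- seq(1, 150, by = 2)
te <- seq(2, 150, by = 2)
fit <- centroid_model(Species ~ ., data = iris, subset = tr)
out <- predict(fit, iris[te, ])
err <- mean(out$class != iris$Species[te])
```

Returning the labels inside a list keeps the method compatible with the second of the two return forms described above; returning the factor directly would match the first.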
The resampling method can be one of cv, scv, loocv, boot, 632b and 632pb.
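To make the bootstrap schemes concrete, the sketch below draws training indices with replacement and treats the samples that were never drawn as the test set of each replicate. It uses base R only; the variable names are illustrative assumptions, and this is not how mt builds its indices internally.

```r
set.seed(1)
n <- 20      # number of observations (assumed for illustration)
nreps <- 4   # number of bootstrap replicates

# each replicate draws n training indices with replacement
boot_idx <- lapply(seq_len(nreps), function(i) sample.int(n, n, replace = TRUE))

# observations never drawn form the out-of-bag test set of that replicate
oob_idx <- lapply(boot_idx, function(tr) setdiff(seq_len(n), tr))
```

Because sampling is with replacement, some observations appear several times in a training set while others are left out, which is what makes an out-of-bag error estimate possible.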
The feature ranking method can be one of fs.rf, fs.auc, fs.welch, fs.anova,
fs.bw, fs.snr, fs.kruskal, fs.relief and fs.rfe.
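As a rough idea of what a univariate ranking method does, the following base R sketch orders the features of a two-class problem by the area under the ROC curve of each single feature. The helper auc_rank is hypothetical and is not the implementation of fs.auc; only the fs.order component name follows the usage shown in the examples of this page.

```r
# Hypothetical helper (not mt's fs.auc): rank features of a two-class
# problem by the area under the ROC curve of each single feature.
auc_rank <- function(x, y) {
  x <- as.matrix(x)
  y <- factor(y)
  stopifnot(nlevels(y) == 2, !is.null(colnames(x)))
  pos <- y == levels(y)[2]
  n1 <- sum(pos)
  n0 <- sum(!pos)
  auc <- apply(x, 2, function(v) {
    # Mann-Whitney form of the AUC, computed from ranks
    a <- (sum(rank(v)[pos]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    max(a, 1 - a)  # score a feature the same whichever class is higher
  })
  list(fs.order = names(sort(auc, decreasing = TRUE)), auc = auc)
}

# Toy data: feature b separates the classes perfectly, feature a does not.
x <- cbind(a = c(3, 1, 4, 2, 6, 5), b = 1:6)
y <- c("ctl", "ctl", "ctl", "trt", "trt", "trt")
fs <- auc_rank(x, y)
fs$fs.order
```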
Value
frankvali returns an object of class frankvali including the following components:
fs.method |
Feature ranking method used. |
cl.method |
Classification method used. |
fs.len |
Feature lengths used. |
fs.rank |
Final feature ranking. It is obtained based on |
err.all |
Error rates for all computations. |
err.iter |
Error rate for each iteration. |
err.avg |
Average error rate for all iterations. |
sampling |
Sampling scheme used. |
niter |
Number of iterations. |
nboot |
Number of bootstrap replications if the sampling method is
one of |
nfold |
Number of cross-validation folds if the sampling is |
nrand |
Number of replications if the sampling is |
fs.list |
Feature list of all computation if |
fs.cl and fs.cl.1 return a matrix with columns acc (accuracy),
auc (area under ROC curve) and mar (class margin).
Note
fs.cl is a simplified version of frankvali. Both frankvali and fs.cl
validate aggregated features from top to bottom only, whereas fs.cl.1
can validate either individual or aggregated features.
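The difference between the two evaluation schemes can be sketched in base R as follows. The feature names and lengths are made up for illustration, and the code only shows which feature subsets each scheme would evaluate, not how the mt functions construct them internally.

```r
fs.order <- c("f3", "f1", "f4", "f2")   # hypothetical ranking, best first
fs.len <- c(1, 2, 4)                    # feature lengths to validate

# aggregated: the top-k features of the ranking for each requested length k
aggregated <- lapply(fs.len, function(k) fs.order[seq_len(k)])

# individual: every ranked feature evaluated by itself
individual <- as.list(fs.order)
```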
Author(s)
Wanchang Lin
See Also
feat.rank.re, frank.err, valipars, boxplot.frankvali, get.fs.len
Examples
data(abr1)
dat <- abr1$pos
x <- preproc(dat[,110:500], method="log10")
y <- factor(abr1$fact$class)
dat <- dat.sel(x, y, choices=c("1","2"))
x.1 <- dat[[1]]$dat
y.1 <- dat[[1]]$cls
len <- c(1:20,seq(25,50,5),seq(60,90,10),seq(100,300,50))
pars <- valipars(sampling="boot",niter=2, nreps=4)
res <- frankvali(x.1,y.1,cl.method = "knn", fs.method="fs.auc",
fs.len=len, pars = pars)
res
summary(res)
boxplot(res)
## Not run:
## or apply feature selection with re-sampling procedure at first
fs <- feat.rank.re(x.1,y.1,method="fs.auc",pars = pars)
## then estimate error of feature selection.
res.1 <- frankvali(x.1,y.1,cl.method = "knn", fs.order=fs$fs.order,
fs.len=len, pars = pars)
res.1
## use formula
data.bin <- data.frame(y.1,x.1)
pars <- valipars(sampling="cv",niter=2,nreps=4)
res.2 <- frankvali(y.1~., data=data.bin,fs.method="fs.rfe",fs.len=len,
cl.method = "knn",pars = pars)
res.2
## examples of fs.cl and fs.cl.1
fs <- fs.rf(x.1, y.1)
res.3 <- fs.cl(x.1,y.1,fs.order=fs$fs.order, fs.len=len,
cl.method = "svm", pars = pars, all.fs=TRUE)
ord <- fs$fs.order[1:50]
## aggregated features
res.4 <- fs.cl.1(x.1,y.1,fs.order=ord, cl.method = "svm", pars = pars,
agg_f=TRUE)
## individual feature
res.5 <- fs.cl.1(x.1,y.1,fs.order=ord, cl.method = "svm", pars = pars,
agg_f=FALSE)
## End(Not run)