R: Feature Ranking and Validation on Feature Subset

frank.err {mt}

R Documentation

Feature Ranking and Validation on Feature Subset

Description

Rank features on the training data, then validate the selected feature subsets by estimating their classification error rate on the test data.

Usage

frank.err(dat.tr, cl.tr, dat.te, cl.te, cl.method="svm",
          fs.method="fs.auc", fs.order=NULL, fs.len="power2", ...)

Arguments

`dat.tr`	A data frame or matrix of training data. Feature ranking and classifier training are carried out on this data set.
`cl.tr`	A factor or vector of training class labels.
`dat.te`	A data frame or matrix of test data. Error rates are calculated on this data set.
`cl.te`	A factor or vector of test class labels.
`cl.method`	Classification method to be used. Any classifier can be used provided it has a `predict` method (`knn` is the exception) that returns either the predicted class labels or a list containing a component named `class`. Examples are `randomForest`, `svm`, `knn` and `lda`.
`fs.method`	Feature ranking method. Ignored if `fs.order` is not `NULL`.
`fs.order`	A vector giving the feature order. The default is `NULL`, in which case feature ranking is performed on the training data using `fs.method`.
`fs.len`	The lengths of feature subsets used for validation. For details, see `get.fs.len`.
`...`	Additional parameters to `fs.method` or `cl.method`.
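
The `fs.len` argument controls which subset sizes are validated. As a rough illustration only, the sketch below assumes that `"power2"` means the powers of two no larger than the number of features, together with the full feature count. `power2_lengths` is a hypothetical helper written for this note and is not part of `mt`; `get.fs.len` remains the authority on the actual behaviour.

```r
## Hypothetical helper (not part of mt): candidate subset lengths for p features,
## ASSUMING "power2" means 1, 2, 4, ... up to p, plus p itself.
power2_lengths <- function(p) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 1)
  lens <- 2^(0:ceiling(log2(p)))   # powers of two, possibly overshooting p
  lens <- lens[lens <= p]          # drop any power of two larger than p
  sort(unique(c(lens, p)))         # always include the full feature count
}

power2_lengths(10)
```

Under this assumption the number of validated subsets grows only logarithmically with the number of features, which keeps validation cheap for wide data sets.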

Value

A list with components:

`cl.method`	Classification method used.
`fs.len`	The lengths of feature subsets used for validation.
`error`	Error rate for each feature subset length in `fs.len`.
`fs.method`	Feature ranking method used.
`fs.order`	Feature order vector.
`fs.rank`	Feature ranking score vector.
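
To make the relationship between the arguments and the returned components concrete, here is a minimal base R sketch of the rank-then-validate idea. It is not the `mt` implementation: the ranking score (absolute difference of class means scaled by a pooled standard deviation), the nearest-centroid classifier and the function name `rank_and_validate` are all stand-ins chosen so that the example runs without extra packages, and the data are simulated.

```r
## Illustrative sketch only; NOT the mt implementation.
## rank_and_validate() is a hypothetical name; exactly two classes are assumed.
rank_and_validate <- function(dat.tr, cl.tr, dat.te, cl.te, fs.len) {
  dat.tr <- as.matrix(dat.tr)
  dat.te <- as.matrix(dat.te)
  cl.tr  <- factor(cl.tr)
  lev    <- levels(cl.tr)
  stopifnot(length(lev) == 2)

  ## 1. Rank features on the training data only (stand-in score).
  a <- dat.tr[cl.tr == lev[1], , drop = FALSE]
  b <- dat.tr[cl.tr == lev[2], , drop = FALSE]
  score <- abs(colMeans(a) - colMeans(b)) /
    sqrt((apply(a, 2, var) + apply(b, 2, var)) / 2)
  fs.order <- order(score, decreasing = TRUE)

  ## 2. For each subset length, fit on training data and score on test data.
  error <- sapply(fs.len, function(k) {
    idx <- fs.order[seq_len(k)]
    cen <- rbind(colMeans(a[, idx, drop = FALSE]),
                 colMeans(b[, idx, drop = FALSE]))
    te  <- dat.te[, idx, drop = FALSE]
    d1  <- rowSums(sweep(te, 2, cen[1, ])^2)   # squared distance to centroid 1
    d2  <- rowSums(sweep(te, 2, cen[2, ])^2)   # squared distance to centroid 2
    pred <- ifelse(d1 <= d2, lev[1], lev[2])
    mean(pred != as.character(cl.te))          # misclassification rate
  })
  names(error) <- fs.len
  list(fs.len = fs.len, error = error, fs.order = fs.order, fs.rank = score)
}

## Simulated data: only features 1 and 2 separate the two classes.
set.seed(1)
n <- 60; p <- 16
cl  <- factor(rep(c("A", "B"), each = n / 2))
dat <- matrix(rnorm(n * p), n, p)
dat[cl == "B", 1:2] <- dat[cl == "B", 1:2] + 3
tr  <- c(1:20, 31:50)                          # 20 samples per class for training

res <- rank_and_validate(dat[tr, ], cl[tr], dat[-tr, ], cl[-tr],
                         fs.len = c(1, 2, 4, 8, 16))
res$fs.order[1:2]   # the two informative features should rank first
res$error           # one error rate per subset length
```

Because the ranking step sees only the training rows, the error rates are computed on data that played no part in feature selection.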

Author(s)

Wanchang Lin

Examples

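## load the example data and log-transform a block of features
data(abr1)
dat <- abr1$pos
x   <- preproc(dat[,110:500], method="log10")
y   <- factor(abr1$fact$class)

## keep only classes "1" and "6" to get a two-class problem
dat <- dat.sel(x, y, choices=c("1","6"))
x.1 <- dat[[1]]$dat
y.1 <- dat[[1]]$cls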

## randomly assign two thirds of the samples to the training set
idx <- sample(seq_len(nrow(x.1)), round((2/3)*nrow(x.1)), replace=FALSE)
## construct train and test data
train.dat  <- x.1[idx,]
train.cl   <- y.1[idx]
test.dat   <- x.1[-idx,]
test.cl    <- y.1[-idx]

## validate feature selection on some feature subsets
res <- frank.err(train.dat, train.cl, test.dat, test.cl, 
                 cl.method="knn", fs.method="fs.auc",  
                 fs.len="power2")
names(res)
## full feature order list
res$fs.order

## test error rate for each validated feature subset
res$error

## or first apply feature selection
fs <- fs.auc(train.dat,train.cl)
## then apply error estimation for each selected feature subset
res.1 <- frank.err(train.dat, train.cl, test.dat, test.cl, 
                   cl.method="knn", fs.order=fs$fs.order,  
                   fs.len="power2")

res.1$error

[Package mt version 2.0-1.20 Index]