classifier {mt} | R Documentation |
Wrapper Function for Classifiers
Description
Wrapper function for classifiers. The classification model is built on the training data, and error estimation is performed on the test data.
Usage
classifier(dat.tr, cl.tr, dat.te=NULL, cl.te=NULL, method,
pred.func=predict,...)
Arguments
dat.tr |
A data frame or matrix of training data. The classification model is built on it. |
cl.tr |
A factor or vector of training class labels. |
dat.te |
A data frame or matrix of test data. Error rates are calculated on this data set. |
cl.te |
A factor or vector of test class labels. |
method |
Classification method to be used. Any classification method can be employed
if it has method |
pred.func |
Predict method (default is predict). |
... |
Additional parameters to |
Value
A list including components:
err |
Error rate of test data. |
cl |
The original class of test data. |
pred |
The predicted class of test data. |
posterior |
Posterior probabilities for the classes if |
acc |
Accuracy rate of classification. |
margin |
The margin of predictions from the classifier. The margin of a data point is defined as the proportion of probability for the
correct class minus the maximum proportion of probabilities for the other classes.
A positive margin means correct classification, and a negative margin means misclassification. This idea comes
from the randomForest package. For more details, see
|
auc |
The area under the receiver operating characteristic (ROC) curve (AUC) if classifier |
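The err and acc components can be related to cl and pred with a few lines of base R. The following is a minimal sketch on made-up labels, assuming that the error rate is the complement of the accuracy rate; it does not call classifier and is not taken from the package source.

```r
## Hypothetical observed and predicted labels, for illustration only
cl   <- factor(c("a", "a", "b", "b", "b"))
pred <- factor(c("a", "b", "b", "b", "a"), levels = levels(cl))

## Accuracy: proportion of predictions that match the observed class
acc <- mean(pred == cl)

## Error rate, assumed here to be 1 - accuracy
err <- 1 - acc

## Cross-tabulation of observed against predicted classes
tab <- table(observed = cl, predicted = pred)
tab
```

The diagonal of tab counts the correct predictions, so sum(diag(tab)) / sum(tab) gives the same accuracy.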
Note
The definition of margin is based on the posterior probabilities. Classifiers
such as randomForest, svm, lda, qda, pcalda and plslda do output posterior
probabilities, but knn does not.
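The margin definition can be illustrated in base R. The sketch below uses a made-up posterior matrix, and margin_of is a hypothetical helper written for this illustration only; it is not a function of the mt package and may differ from how classifier computes the margin internally.

```r
## Hypothetical posterior probabilities: rows are samples, columns are classes
post <- matrix(c(0.7, 0.2, 0.1,
                 0.3, 0.5, 0.2,
                 0.1, 0.3, 0.6),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, c("A", "B", "C")))

## Hypothetical true classes of the three samples
cl <- factor(c("A", "A", "C"), levels = c("A", "B", "C"))

## Margin: probability of the true class minus the largest
## probability among the remaining classes
margin_of <- function(post, cl) {
  idx <- match(as.character(cl), colnames(post))
  vapply(seq_len(nrow(post)), function(i) {
    post[i, idx[i]] - max(post[i, -idx[i]])
  }, numeric(1))
}

mar <- margin_of(post, cl)
mar   # positive for samples 1 and 3, negative for sample 2
```

Sample 2 belongs to class A but receives its highest probability for class B, so its margin is negative, which corresponds to a misclassification.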
Author(s)
Wanchang Lin
See Also
Examples
data(abr1)
dat <- preproc(abr1$pos[,110:500], method="log10")
cls <- factor(abr1$fact$class)
## tmp <- dat.sel(dat, cls, choices=c("1","2"))
## dat <- tmp[[1]]$dat
## cls <- tmp[[1]]$cls
idx <- sample(1:nrow(dat), round((2/3)*nrow(dat)), replace = FALSE)
## construct train and test data
train.dat <- dat[idx,]
train.cl <- cls[idx]
test.dat <- dat[-idx,]
test.cl <- cls[-idx]
## estimate accuracy
res <- classifier(train.dat, train.cl, test.dat, test.cl,
method="randomForest")
res
## get confusion matrix
cl.rate(obs=res$cl, res$pred) ## same as: cl.rate(obs=test.cl, res$pred)
## Measurements of Forensic Glass Fragments
data(fgl, package = "MASS") # in MASS package
dat <- subset(fgl, grepl("WinF|WinNF",type))
## dat <- subset(fgl, type %in% c("WinF", "WinNF"))
x <- subset(dat, select = -type)
y <- factor(dat$type)
## construct train and test data
idx <- sample(1:nrow(x), round((2/3)*nrow(x)), replace = FALSE)
tr.x <- x[idx,]
tr.y <- y[idx]
te.x <- x[-idx,]
te.y <- y[-idx]
res.1 <- classifier(tr.x, tr.y, te.x, te.y, method="svm")
res.1
cl.rate(obs=res.1$cl, res.1$pred)
## classification performance for the two-class case.
pos <- "WinF" ## select positive level
cl.perf(obs=res.1$cl, pre=res.1$pred, pos=pos)
## ROC and AUC
cl.roc(stat=res.1$posterior[,pos],label=res.1$cl, pos=pos)