accest {mt}    R Documentation
Estimate Classification Accuracy By Resampling Method
Description
Estimate classification accuracy rate by resampling method.
Usage
accest(dat, ...)
## Default S3 method:
accest(dat, cl, method, pred.func = predict, pars = valipars(),
       tr.idx = NULL, ...)
## S3 method for class 'formula'
accest(formula, data = NULL, ..., subset, na.action = na.omit)
aam.cl(x, y, method, pars = valipars(), ...)
aam.mcl(x, y, method, pars = valipars(), ...)
Arguments
formula: A formula of the form response ~ predictors, where the response is the class factor and the right-hand side specifies the explanatory variables.
data: Data frame from which variables specified in formula are to be taken.
dat, x: A matrix or data frame containing the explanatory variables if no formula is given as the principal argument.
cl, y: A factor specifying the class for each observation if no formula principal argument is given.
method: Classification method whose accuracy rate is to be estimated, such as randomForest, svm, knn or lda. See the Note below for the required interface.
pred.func: Predict method (default is predict).
pars: A list of parameters used by the resampling method, such as Leave-one-out cross-validation, Cross-validation, Bootstrap and Randomised validation (holdout). See valipars for details.
tr.idx: User-defined index of training samples. Can be generated by trainind.
...: Additional parameters to method.
subset: Optional vector specifying a subset of observations to be used.
na.action: Function which indicates what should happen when the data contain NAs. The default is na.omit.
Details
Classification accuracy rates are estimated for techniques such as Random Forest, Support Vector Machine, k-Nearest Neighbour Classification and Linear Discriminant Analysis, using resampling methods including Leave-one-out cross-validation, Cross-validation, Bootstrap and Randomised validation (holdout).
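As a minimal sketch of this workflow (assuming the package providing accest is mt, as the header indicates, and that MASS is installed so that lda can serve as the classifier; the components accessed are those documented under Value below), a cross-validated LDA run on the iris data could look like:
library(mt)    # provides accest and valipars
library(MASS)  # provides lda, used here as the classifier
data(iris)
## 5-fold cross-validation repeated over 2 iterations
acc.lda <- accest(Species ~ ., data = iris, method = "lda",
                  pars = valipars(sampling = "cv", niter = 2, nreps = 5))
acc.lda$acc       # overall accuracy rate
acc.lda$acc.iter  # average accuracy rate per iteration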
Value
accest returns an object including the components:
method: Classification method used.
acc: Overall accuracy rate.
acc.iter: Average accuracy rate for each iteration.
acc.all: Accuracy rate for each iteration and replication.
auc: Overall area under the receiver operating characteristic curve (AUC).
auc.iter: Average AUC for each iteration.
auc.all: AUC for each iteration and replication.
mar: Overall prediction margin.
mar.iter: Average prediction margin for each iteration.
mar.all: Prediction margin for each iteration and replication.
err: Overall error rate.
err.iter: Average error rate for each iteration.
err.all: Error rate for each iteration and replication.
sampling: Sampling scheme used.
niter: Number of iterations.
nreps: Number of replications in each iteration if the resampling method is not leave-one-out cross-validation.
conf: Overall confusion matrix.
res.all: All results, which can be further processed.
acc.boot: A list of bootstrap accuracy estimates, such as the .632 and .632+ estimates, if the sampling method is bootstrap.
aam.cl returns a vector with acc (accuracy), auc (area under ROC curve) and mar (class margin).
aam.mcl returns a matrix with columns acc (accuracy), auc (area under ROC curve) and mar (class margin).
Note
accest can take any classification model whose fitting function has the argument form model(formula, data, subset, na.action, ...) and whose corresponding predict method predict.model(object, newdata, ...) returns either the predicted class labels alone or a list with a component called class, such as lda and pcalda.
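As a minimal sketch of this interface (the names my.centroid and predict.my.centroid are hypothetical and not part of the package, and numeric predictors are assumed), the toy nearest-centroid classifier below has the required fitting signature and a predict method that returns a list with a class component. Whether it can be passed as method = "my.centroid" in practice depends on the exact call accest makes internally, so treat it as an illustration of the interface rather than a tested extension.
## Hypothetical nearest-centroid classifier matching the interface above
my.centroid <- function(formula, data, ...) {   # subset/na.action absorbed by ...
  mf  <- model.frame(formula, data)
  y   <- model.response(mf)                     # class factor
  x   <- as.matrix(mf[, -1, drop = FALSE])      # numeric predictors assumed
  cen <- t(sapply(levels(y), function(lev) colMeans(x[y == lev, , drop = FALSE])))
  structure(list(centroids = cen, vars = colnames(x)), class = "my.centroid")
}
## Predict method returning a list with a 'class' component
predict.my.centroid <- function(object, newdata, ...) {
  x   <- as.matrix(newdata[, object$vars, drop = FALSE])
  cen <- object$centroids
  ## squared Euclidean distance from each row of x to each class centroid
  d <- outer(rowSums(x^2), rowSums(cen^2), "+") - 2 * x %*% t(cen)
  list(class = factor(rownames(cen)[max.col(-d)], levels = rownames(cen)))
}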
If the classifier method provides posterior probabilities, the prediction margin mar will be generated; otherwise it is NULL.
If the classifier method provides posterior probabilities and the classification is a two-class problem, auc will be generated; otherwise it is NULL.
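A quick check of these components (using the acc object fitted in the Examples below) might look like:
acc$mar            # overall prediction margin; NULL if no posterior probabilities are provided
is.null(acc$auc)   # TRUE here, since iris is a three-class problem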
aam.cl is a wrapper function of accest, returning the accuracy rate, AUC and classification margin. aam.mcl accepts multiple classifiers in a single run.
Author(s)
Wanchang Lin
See Also
binest, maccest, valipars, trainind, classifier
Examples
# Iris data
data(iris)
# Use KNN classifier and bootstrap for resampling
acc <- accest(Species ~ ., data = iris, method = "knn",
              pars = valipars(sampling = "boot", niter = 2, nreps = 5))
acc
summary(acc)
acc$acc.boot
# Alternatively, use the traditional interface:
x <- subset(iris, select = -Species)
y <- iris$Species
## -----------------------------------------------------------------------
# Random Forest with 5-fold stratified cv
pars <- valipars(sampling = "cv", niter = 4, nreps = 5, strat = TRUE)
tr.idx <- trainind(y, pars = pars)
acc1 <- accest(x, y, method = "randomForest", pars = pars, tr.idx = tr.idx)
acc1
summary(acc1)
# plot the accuracy in each iteration
plot(acc1)
## -----------------------------------------------------------------------
# Forensic Glass data in chap.12 of MASS
data(fgl, package = "MASS") # in MASS package
# Randomised validation (holdout) of SVM for fgl data
acc2 <- accest(type ~ ., data = fgl, method = "svm", cost = 100, gamma = 1,
               pars = valipars(sampling = "rand", niter = 10, nreps = 4, div = 2/3))
acc2
summary(acc2)
# plot the accuracy in each iteration
plot(acc2)
## -----------------------------------------------------------------------
## Examples of aam.cl and aam.mcl
aam.1 <- aam.cl(x, y, method = "svm", pars = pars)
aam.2 <- aam.mcl(x, y, method = c("svm", "randomForest"), pars = pars)
## If only two classes are used, AUC will be calculated
idx <- (y == "setosa")
aam.3 <- aam.cl(x[!idx, ], factor(y[!idx]), method = "svm", pars = pars)
aam.4 <- aam.mcl(x[!idx, ], factor(y[!idx]), method = c("svm", "randomForest"), pars = pars)
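The runs above cover the Bootstrap, Cross-validation and Randomised validation schemes; a Leave-one-out run can be sketched the same way. The sampling name "loocv" is an assumption here, not confirmed by this page; see valipars for the exact option names.
## -----------------------------------------------------------------------
## Leave-one-out cross-validation (the sampling name "loocv" is assumed;
## check ?valipars for the exact spelling)
acc3 <- accest(x, y, method = "knn", pars = valipars(sampling = "loocv"))
acc3
summary(acc3)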