classificationError {optBiomarker} | R Documentation |
Estimation of misclassification errors (generalisation errors) based on statistical and various machine learning methods
Description
Estimates misclassification errors (generalisation errors), sensitivity and specificity using cross-validation, bootstrap and .632+ bias-corrected bootstrap methods, based on Random Forest, Support Vector Machine, Linear Discriminant Analysis and k-Nearest Neighbour classifiers.
Usage
## S3 method for class 'data.frame'
classificationError(
formula,
data,
method=c("RF","SVM","LDA","KNN"),
errorType = c("cv", "boot", "six32plus"),
senSpec=TRUE,
negLevLowest=TRUE,
na.action=na.omit,
control=control.errorest(k=NROW(na.action(data)),nboot=100),
...)
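The default `control` sets `k` to the number of rows left after `na.action` has been applied. Assuming `k` is the number of cross-validation folds (as in `control.errorest` from the ipred package), the default with `errorType = "cv"` therefore amounts to leave-one-out cross-validation. A minimal base R sketch of how that default evaluates, using a made-up toy data frame:

```r
# Toy data (illustrative only): five samples, one with a missing value
toy <- data.frame(
  class = factor(c("neg", "neg", "pos", "pos", "pos")),
  x1    = c(0.1, NA, 0.5, 0.9, 1.2)
)

# Mirrors the default k = NROW(na.action(data)) with na.action = na.omit
kDefault <- NROW(na.omit(toy))

# One fold per complete observation, i.e. leave-one-out
kDefault
```

Passing a smaller `k` through `control` would request fewer folds instead.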
Arguments
formula |
A formula of the form |
data |
A data frame containing the response (class membership) variable and the explanatory variables in the formula. |
method |
A character vector of length |
errorType |
A character vector of length |
senSpec |
Logical. Should sensitivity and specificity be computed (available for the cross-validation estimator only)? Defaults to |
negLevLowest |
Logical. Does the lowest of the ordered levels of the class variable represent
the negative control? Defaults to |
na.action |
Function which indicates what should happen when the data
contains |
control |
Control parameters of the function |
... |
Additional parameters to |
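To illustrate what `negLevLowest` controls, the sketch below computes sensitivity and specificity from true and predicted labels using the standard definitions, treating the first (lowest) factor level as the negative class. The helper `senSpecFromLabels` and the toy labels are hypothetical and are not the package's internal code; a two-level class variable is assumed.

```r
# Hypothetical helper, not part of optBiomarker: standard sensitivity and
# specificity for a two-level factor.
senSpecFromLabels <- function(truth, pred, negLevLowest = TRUE) {
  lev <- levels(truth)
  # The lowest level is the negative class when negLevLowest is TRUE
  neg <- if (negLevLowest) lev[1] else lev[2]
  pos <- setdiff(lev, neg)
  c(
    sensitivity = mean(pred[truth == pos] == pos),  # TP / (TP + FN)
    specificity = mean(pred[truth == neg] == neg)   # TN / (TN + FP)
  )
}

# Toy labels (illustrative only): "ctrl" is the lowest level
truth <- factor(c("ctrl", "ctrl", "ctrl", "case", "case"),
                levels = c("ctrl", "case"))
pred  <- factor(c("ctrl", "ctrl", "case", "case", "case"),
                levels = c("ctrl", "case"))

ss <- senSpecFromLabels(truth, pred)
ss
```

With `negLevLowest = FALSE` the roles of the two levels are swapped, so the two values trade places.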
Details
In the current version of the package, estimation of sensitivity and specificity is limited to the cross-validation estimator. For LDA, the sample size must be greater than the number of explanatory variables to avoid singularity. The function classificationError does not check whether this condition is satisfied, but the underlying function lda produces warnings if it is violated.
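Because classificationError does not perform this check itself, a caller may want to verify the condition before requesting LDA. The following is a minimal base R sketch; the helper `ldaSampleSizeOk` and the toy data are hypothetical and are not part of optBiomarker.

```r
# Hypothetical helper, not part of optBiomarker: TRUE when the number of
# complete cases exceeds the number of explanatory variables.
ldaSampleSizeOk <- function(formula, data, na.action = na.omit) {
  mf <- model.frame(formula, data = data, na.action = na.action)
  n <- nrow(mf)
  # Design-matrix columns minus the intercept
  p <- ncol(model.matrix(attr(mf, "terms"), mf)) - 1
  n > p
}

# Toy data (illustrative only): 10 samples, 3 explanatory variables
set.seed(1)
toy <- data.frame(
  class = factor(rep(c("neg", "pos"), each = 5)),
  x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10)
)

okFull  <- ldaSampleSizeOk(class ~ ., toy)             # 10 samples, 3 variables
okSmall <- ldaSampleSizeOk(class ~ ., toy[c(1, 6), ])  # 2 samples, 3 variables
```

Here `okFull` is `TRUE` and `okSmall` is `FALSE`, so only the full data set would be passed on with `method = "LDA"`.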
Value
Returns an object of class classificationError with components
call |
The call of the |
errorRate |
A |
rocData |
A |
Author(s)
Mizanur Khondoker, Till Bachmann, Peter Ghazal
Maintainer: Mizanur Khondoker mizanur.khondoker@gmail.com.
References
Khondoker, M. R., Bachmann, T. T., Mewissen, M., Dickinson, P. et al. (2010). Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules. Journal of Bioinformatics and Computational Biology, 8, 945-965.
Breiman, L. (2001). Random Forests, Machine Learning 45(1), 5–32.
Chang, C.-C. and Lin, C.-J. LIBSVM: a library for Support Vector Machines. https://www.csie.ntu.edu.tw/~cjlin/libsvm/.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
Efron, B. and Tibshirani, R. (1997). Improvements on Cross-Validation: The .632+ Bootstrap Estimator. Journal of the American Statistical Association 92(438), 548–560.
See Also
Examples
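## Not run:
library(optBiomarker)
# Simulate a training set of 30 samples with 3 biomarkers
mydata <- simData(nTrain = 30, nBiom = 3)$data
# Estimate misclassification errors using the default arguments
classificationError(formula = class ~ ., data = mydata)
## End(Not run)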