errorDbase {optBiomarker} | R Documentation |
Database of leave-one-out cross validation errors for various combinations of data characteristics
Description
This is a 7-dimensional array (database) of leave-one-out cross
validation errors for Random Forest, Support Vector Machine and
k-Nearest Neighbour classifiers. Linear Discriminant Analysis is not
included in the current version. The database is the basis for
estimating the optimal number of biomarkers at a given error tolerance
level using the optimiseBiomarker function. See Details for more
information.
Usage
data(errorDbase)
Format
7-dimensional numeric array.
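A minimal sketch of loading and inspecting the object, assuming the optBiomarker package is installed. The guard with requireNamespace and the variable name has_pkg are illustrative choices, not part of the package.

```r
# Check whether optBiomarker is available before touching its data set.
has_pkg <- requireNamespace("optBiomarker", quietly = TRUE)

if (has_pkg) {
  # Load the data set from the package without attaching the package itself.
  data(errorDbase, package = "optBiomarker")
  print(dim(errorDbase))        # lengths of the 7 dimensions
  str(dimnames(errorDbase))     # names and levels of each dimension
} else {
  message("optBiomarker is not installed; skipping inspection.")
}
```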
Details
The following table gives the dimension names, lengths and values/levels
of the data object errorDbase.
Dimension name | Length | Values/Levels |
No. of biomarkers | 14 | (1-6, 7, 9, 11, 15, 20, 30, 40, 50, 100) |
Size of replication | 5 | (1, 3, 5, 7, 10) |
Biological variation (\sigma_b) | 4 | (0.5, 1.0, 1.5, 2.5) |
Experimental variation (\sigma_e) | 4 | (0.1, 0.5, 1.0, 1.5) |
Minimum (Average) fold change | 4 | (1 (1.73), 2 (2.88), 3 (4.03), 5 (6.33)) |
Training set size | 5 | (10, 20, 50, 100, 250) |
Classification method | 3 | (Random Forest, Support Vector Machine, k-Nearest Neighbour) |
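To make the layout in the table concrete, the sketch below builds a placeholder array with the same dimension lengths and extracts one slice. It does not use the real data: the array is filled with NA, and the dimension names and level labels (replication, sigma_b, "RF" and so on) are hypothetical and may differ from the dimnames actually stored in errorDbase.

```r
# Dimension lengths as listed in the table (biomarkers, replication, sigma_b,
# sigma_e, fold change, training set size, classification method).
dim_lengths <- c(14, 5, 4, 4, 4, 5, 3)

# Placeholder array of the same shape; NOT the real error rates.
mock <- array(NA_real_, dim = dim_lengths)

# Hypothetical labels; the first dimension is left unlabelled.
dimnames(mock) <- list(
  biomarkers  = NULL,
  replication = c("1", "3", "5", "7", "10"),
  sigma_b     = c("0.5", "1", "1.5", "2.5"),
  sigma_e     = c("0.1", "0.5", "1", "1.5"),
  min_fold    = c("1", "2", "3", "5"),
  n_train     = c("10", "20", "50", "100", "250"),
  method      = c("RF", "SVM", "kNN")
)

# Fix six of the seven factors; what remains is one value per number of
# biomarkers, i.e. an error curve along the first dimension.
curve <- mock[, "3", "1.5", "1", "2", "100", "RF"]
length(curve)
```

Leaving the first index empty keeps the whole biomarker axis, which is the slice one would inspect to see how the error changes as biomarkers are added.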
We plan to expand the database to an 8-dimensional one by
adding another dimension to store error rates at different levels
of correlation between biomarkers. The length of each dimension will
also be increased, leading to a bigger database with wider coverage
of the parameter space. The current version of the database contains error rates
for independent (correlation = 0) biomarkers only. Also, it does not
contain error rates for Linear Discriminant Analysis, which we plan
to implement in the next release of the package. With the current
version of the database, the optimal number of biomarkers can be
estimated using the optimiseBiomarker
function for any intermediate values of the factors represented by
the dimensions of the database.
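As an illustration of what an estimate at an intermediate value means, the sketch below linearly interpolates an error rate between two grid points of the training set size using base R's approx. This is an assumption made for illustration only: the documentation does not state which method optimiseBiomarker uses, and the error values here are made up rather than taken from errorDbase.

```r
# Training set sizes on the database grid (from the table in Details).
n_train_grid <- c(10, 20, 50, 100, 250)

# Made-up error rates at those grid points (illustrative, not real data).
err_grid <- c(0.30, 0.22, 0.15, 0.12, 0.10)

# Linear interpolation at a training set size that is not on the grid.
est <- approx(x = n_train_grid, y = err_grid, xout = 75)$y
est
```

The same idea extends to the other numeric factors, although interpolating over several factors at once needs a multivariate scheme rather than a single call to approx.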
Author(s)
Mizanur Khondoker, Till Bachmann, Peter Ghazal
Maintainer: Mizanur Khondoker mizanur.khondoker@gmail.com.
References
Khondoker, M. R., Bachmann, T. T., Mewissen, M., Dickinson, P. et al. (2010). Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules. Journal of Bioinformatics and Computational Biology, 8, 945-965.