| binda.ranking {binda} | R Documentation |
Binary Discriminant Analysis: Variable Ranking
Description
binda.ranking ranks the predictors by computing, for each variable and each group,
the t-score between the group mean and the pooled mean.
plot.binda.ranking provides a graphical visualization of the top-ranking variables.
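As a rough illustration of what such a t-score looks like, the sketch below compares each group mean with the pooled mean and scales the difference by a Bernoulli standard error. The helper tscore_sketch and its formula are assumptions chosen because they reproduce the t-scores printed in the Examples; they are not taken from the package source, and the actual estimator (including the shrinkage controlled by lambda.freqs) may differ.

```r
# Toy data from the Examples section
Xtrain <- matrix(c(1, 1, 0, 1, 0, 0,
                   1, 1, 1, 1, 0, 0,
                   1, 0, 0, 0, 1, 1,
                   1, 0, 0, 0, 1, 1), nrow = 4, byrow = TRUE)
colnames(Xtrain) <- paste0("V", 1:ncol(Xtrain))
L <- factor(c("Treatment", "Treatment", "Control", "Control"))

# Hypothetical helper (not part of binda): group mean vs. pooled mean
tscore_sketch <- function(X, L) {
  n  <- nrow(X)
  mu <- colMeans(X)        # pooled mean of each binary variable
  v  <- mu * (1 - mu)      # Bernoulli variance at the pooled mean
  sapply(levels(L), function(k) {
    rows <- L == k
    nk <- sum(rows)
    d  <- colMeans(X[rows, , drop = FALSE]) - mu
    se <- sqrt((1 / nk - 1 / n) * v)
    # ASSUMPTION: a constant variable (zero variance) gets a t-score of 0
    ifelse(se > 0, d / se, 0)
  })
}

tt <- tscore_sketch(Xtrain, L)
round(tt, 6)
```

The rows of tt are in the original variable order (V1 to V6), not sorted by score.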
Usage
binda.ranking(Xtrain, L, lambda.freqs, verbose=TRUE)
## S3 method for class 'binda.ranking'
plot(x, top=40, arrow.col="blue", zeroaxis.col="red", ylab="Variables", main, ...)
Arguments
Xtrain |
A binary matrix containing the training data set. Rows correspond to observations and columns to variables. |
L |
A factor with the class labels of the training samples. |
lambda.freqs |
Shrinkage intensity for the class frequencies. If not specified it is
estimated from the data. |
verbose |
If TRUE (the default), print progress information while computing. |
x |
A "binda.ranking" object – this is produced by the binda.ranking() function. |
top |
The number of top-ranking variables shown in the plot (default: 40). |
arrow.col |
Color of the arrows in the plot (default is "blue"). |
zeroaxis.col |
Color for the center zero axis (default is "red"). |
ylab |
Label written next to feature list (default is "Variables"). |
main |
Main title (if missing, a default title is generated). |
... |
Other options passed on to generic plot(). |
Details
The overall ranking of a feature is determined by computing a weighted sum of
the squared t-scores. This is approximately equivalent to the mutual information between the response and each variable. The same criterion is used in dichotomize. For precise details see Gibb and Strimmer (2015).
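The weighted sum can be checked by hand against the Examples output. The sketch below copies the t-scores printed there and assumes the weights are the class frequencies (0.5 each, since both groups have two samples). That weighting is an assumption, not a statement about the package internals, but it matches the printed score column up to rounding.

```r
# t-scores copied from the Examples output (already rounded to 6 digits)
tscores <- matrix(c(-2,         2,
                    -2,         2,
                     2,        -2,
                     2,        -2,
                    -1.154701,  1.154701,
                     0,         0),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(c("V2", "V4", "V5", "V6", "V3", "V1"),
                                  c("t.Control", "t.Treatment")))

# ASSUMPTION: the weights are the class frequencies (2 of 4 samples per group)
freqs <- c(Control = 0.5, Treatment = 0.5)

# weighted sum of squared t-scores, one value per variable
score <- drop(tscores^2 %*% freqs)
score
```

Because both groups are equally frequent here, each score is simply the average of the two squared t-scores.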
Value
binda.ranking returns a matrix with the following columns:
idx |
original feature number |
score |
the score determining the overall ranking of a variable |
t |
for each group and feature the t-score of the class mean versus the pooled mean |
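One common use of the idx column is to subset the training matrix to the top-ranking variables. The sketch below builds a stand-in for the returned matrix by copying the values from the Examples output, so that it runs without the package; with binda installed, br would instead come from binda.ranking(Xtrain, L). It assumes the rows are sorted by decreasing score, as in that output.

```r
# Toy data from the Examples section
Xtrain <- matrix(c(1, 1, 0, 1, 0, 0,
                   1, 1, 1, 1, 0, 0,
                   1, 0, 0, 0, 1, 1,
                   1, 0, 0, 0, 1, 1), nrow = 4, byrow = TRUE)
colnames(Xtrain) <- paste0("V", 1:ncol(Xtrain))

# Stand-in for the ranking matrix (values copied from the Examples output)
br <- cbind(idx         = c(2, 4, 5, 6, 3, 1),
            score       = c(4, 4, 4, 4, 1.333333, 0),
            t.Control   = c(-2, -2, 2, 2, -1.154701, 0),
            t.Treatment = c(2, 2, -2, -2, 1.154701, 0))
rownames(br) <- c("V2", "V4", "V5", "V6", "V3", "V1")

# column indices of the four top-ranking variables
top.idx <- br[seq_len(4), "idx"]

# training data restricted to those variables
Xsel <- Xtrain[, top.idx, drop = FALSE]
colnames(Xsel)
```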
Author(s)
Sebastian Gibb and Korbinian Strimmer (https://strimmerlab.github.io).
References
Gibb, S., and K. Strimmer. 2015. Differential protein expression and peak selection in mass spectrometry data by binary discriminant analysis. Bioinformatics 31:3156-3162. <DOI:10.1093/bioinformatics/btv334>
See Also
binda, predict.binda, dichotomize.
Examples
# load "binda" library
library("binda")
# training data set with labels
Xtrain = matrix(c(1, 1, 0, 1, 0, 0,
1, 1, 1, 1, 0, 0,
1, 0, 0, 0, 1, 1,
1, 0, 0, 0, 1, 1), nrow=4, byrow=TRUE)
colnames(Xtrain) = paste0("V", 1:ncol(Xtrain))
is.binaryMatrix(Xtrain) # TRUE
L = factor(c("Treatment", "Treatment", "Control", "Control") )
# ranking variables
br = binda.ranking(Xtrain, L)
br
#    idx    score t.Control t.Treatment
# V2   2 4.000000 -2.000000    2.000000
# V4   4 4.000000 -2.000000    2.000000
# V5   5 4.000000  2.000000   -2.000000
# V6   6 4.000000  2.000000   -2.000000
# V3   3 1.333333 -1.154701    1.154701
# V1   1 0.000000  0.000000    0.000000
#attr(,"class")
#[1] "binda.ranking"
#attr(,"cl.count")
#[1] 2
# show plot
plot(br)
# result: variable V1 is irrelevant for distinguishing the two groups