errormatrix {klaR}    R Documentation
Tabulation of prediction errors by classes
Description
Cross-tabulates true and predicted classes with the option to show relative frequencies.
Usage
errormatrix(true, predicted, relative = FALSE)
Arguments
true        Vector of true classes.
predicted   Vector of predicted classes.
relative    Logical. If TRUE, the rows are normalized to show relative frequencies (see Details).
Details
Given vectors of true and predicted classes, a square table of misclassifications is constructed, with one row and one column per class.
Element [i,j] shows the number of objects of class i that were classified as class j, so the main diagonal shows the correct classifications. The last row and the last column show the corresponding sums of misclassifications, and the lower right element is the total number of misclassifications.
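The layout described above can be sketched in base R. This is only an illustration of the description, not the klaR source: the toy labels, the helper names (true_cls, pred_cls, off_diag, abs_tab) and the margin label "SUM" are assumptions made for this example.

```r
# Toy labels for three classes (hypothetical data, for illustration only).
true_cls <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"),
                   levels = c("a", "b", "c"))
pred_cls <- factor(c("a", "b", "a", "b", "c", "c", "c", "a"),
                   levels = c("a", "b", "c"))

# Cross-tabulate: rows are true classes, columns are predicted classes.
counts <- unclass(table(true_cls, pred_cls))

# Misclassifications are the off-diagonal entries.
off_diag <- counts
diag(off_diag) <- 0

# Append the misclassification sums as last column and last row;
# the lower right element is the total number of misclassifications.
abs_tab <- rbind(cbind(counts, SUM = rowSums(off_diag)),
                 SUM = c(colSums(off_diag), sum(off_diag)))
abs_tab
```

Keeping both factors on the same set of levels is what makes the table square even if some class is never predicted.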
If ‘relative’ is TRUE, the rows are normalized so they show relative frequencies instead. The lower right element now shows the total error rate, and the remaining last row sums up to one, so it shows “where the misclassifications went”.
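One way to read the normalization described above, sketched in base R. The count matrix, the helper names (errors, rel_tab) and the margin label "SUM" are hypothetical, and the code follows the text of this section rather than the klaR implementation. It assumes at least one misclassification; otherwise the last row would involve a division by zero.

```r
# Hypothetical counts: rows = true class, columns = predicted class.
counts <- matrix(c(2, 1, 0,
                   0, 1, 1,
                   1, 0, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(true = c("a", "b", "c"),
                                 predicted = c("a", "b", "c")))

# Off-diagonal entries are the misclassifications.
errors <- counts
diag(errors) <- 0
k <- nrow(counts)

rel_tab <- matrix(0, k + 1, k + 1,
                  dimnames = list(c(rownames(counts), "SUM"),
                                  c(colnames(counts), "SUM")))

# Each class row is divided by the number of objects in that class.
rel_tab[1:k, 1:k] <- counts / rowSums(counts)
# Last column: error rate within each class.
rel_tab[1:k, k + 1] <- rowSums(errors) / rowSums(counts)
# Last row: share of all misclassifications assigned to each predicted class.
rel_tab[k + 1, 1:k] <- colSums(errors) / sum(errors)
# Lower right element: overall fraction of misclassified objects.
rel_tab[k + 1, k + 1] <- sum(errors) / sum(counts)

round(rel_tab, 3)
```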
Value
A (named) matrix.
Note
Concerning the case that ‘relative’ is TRUE:
If a prior distribution over the classes is given, the misclassification rate returned as the lower right element (which is only the fraction of misclassified data) is not an estimator of the expected misclassification rate.
In that case, multiply the individual error rate of each class (returned in the last column) by the corresponding prior probability and sum the products (see example below).
The two error rate estimates are equal if the class fractions in the data equal the prior probabilities.
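A small base R check of this relationship, using made-up counts for two unbalanced classes. All names (counts, class_err, overall, prior_equal) and the numbers are assumptions for illustration.

```r
# Hypothetical counts: class "a" has 40 objects (4 wrong),
# class "b" has 10 objects (5 wrong).
counts <- matrix(c(36, 4,
                    5, 5),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(true = c("a", "b"),
                                 predicted = c("a", "b")))

n_class   <- rowSums(counts)
class_err <- 1 - diag(counts) / n_class            # per-class error rates
overall   <- 1 - sum(diag(counts)) / sum(counts)   # fraction misclassified

# Priors equal to the class fractions in the data reproduce 'overall'.
prior_data <- n_class / sum(counts)
rate_data  <- sum(class_err * prior_data)

# A different (hypothetical) prior gives a different expected error rate.
prior_equal <- c(a = 0.5, b = 0.5)
rate_equal  <- sum(class_err * prior_equal)

c(overall = overall, rate_data = rate_data, rate_equal = rate_equal)
```

With these counts the data-based rate is 0.18, while equal priors give 0.3, because the poorly classified class "b" then carries more weight.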
Author(s)
Christian Röver, roever@statistik.tu-dortmund.de
See Also
Examples
data(iris)
library(MASS)  # provides lda()
library(klaR)  # provides errormatrix()
x <- lda(Species ~ Sepal.Length + Sepal.Width, data=iris)
y <- predict(x, iris)
# absolute numbers:
errormatrix(iris$Species, y$class)
# relative frequencies:
errormatrix(iris$Species, y$class, relative = TRUE)
# percentages:
round(100 * errormatrix(iris$Species, y$class, relative = TRUE), 0)
# expected error rate in case of class prior:
indiv.rates <- errormatrix(iris$Species, y$class, relative = TRUE)[1:3, 4]
prior <- c("setosa" = 0.2, "versicolor" = 0.3, "virginica" = 0.5)
total.rate <- t(indiv.rates) %*% prior
total.rate