confusion {mlearning}    R Documentation
Construct and analyze confusion matrices
Description
Confusion matrices compare two classifications: usually a classification produced automatically by a machine learning algorithm versus the true classification made by a specialist, but one can also compare two automatic, or two manual, classifications against each other.
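For instance, with the default method, two factors of the same length and with the same levels can be compared directly (a minimal sketch on made-up data, not taken from the package examples):

library(mlearning)
# Two classifications of the same six items (hypothetical data)
predicted <- factor(c("A", "A", "B", "B", "C", "C"))
actual <- factor(c("A", "B", "B", "B", "C", "A"))
# The first classification is tabulated in rows, the second in columns
confusion(predicted, actual)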
Usage
confusion(x, ...)

## Default S3 method:
confusion(
  x,
  y = NULL,
  vars = c("Actual", "Predicted"),
  labels = vars,
  merge.by = "Id",
  useNA = "ifany",
  prior,
  ...
)

## S3 method for class 'mlearning'
confusion(
  x,
  y = response(x),
  labels = c("Actual", "Predicted"),
  useNA = "ifany",
  prior,
  ...
)

## S3 method for class 'confusion'
print(x, sums = TRUE, error.col = sums, digits = 0, sort = "ward.D2", ...)

## S3 method for class 'confusion'
summary(object, type = "all", sort.by = "Fscore", decreasing = TRUE, ...)

## S3 method for class 'summary.confusion'
print(x, ...)
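Applied to a fitted model, the confusion() method for mlearning objects compares the model's predictions with response(x) by default, giving a self-consistency confusion matrix (a sketch assuming the glass_lvq model built in the Examples section below):

# Self-consistency: predictions on the training data vs the observed response
confusion(glass_lvq)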
Arguments
x
an object with a confusion() method implemented.

...
further arguments passed to the method.
y
another object, from which to extract the second classification, or NULL if not used.

vars
the variables of interest in the first and second classification in the case the objects are lists or data frames. Otherwise, this argument is ignored and x and y must be factors with the same length and the same levels.
labels
labels to use for the two classifications. By default, they are the same as vars, that is, c("Actual", "Predicted").

merge.by
a character string with the name of variables to use to merge the two data frames, or NULL to not merge them.
useNA
do we keep NA values as a separate category? The default "ifany" creates this category only if there are missing values; the other possibilities are "no" and "always".
prior
class frequencies to use for the first classifier, which is tabulated in the rows of the confusion matrix. For accepted values, see prior<-().
sums
should the confusion matrix be printed with row and column sums?

error.col
should a column with the class error for the first classifier be added (equivalent to the false negative rate, or FNR)?

digits
the number of digits after the decimal point to print in the confusion matrix. The default of zero gives the most compact presentation and is suitable for frequencies, but not for relative frequencies.
sort
should rows and columns of the confusion matrix be sorted so that classes with larger confusion are closer together? Sorting is done by hierarchical clustering with hclust(); the clustering method is "ward.D2" by default (see the hclust() help for other options). Use FALSE or NULL to leave the matrix unsorted.
object
a confusion object.

type
either "all" (by default) to compute every statistic, or the name of a single statistic to tabulate, such as "Fscore", "Recall", or "Precision".
sort.by
the statistic used to sort the table (by default "Fscore", the F-measure or F1 score for each class: 2 * recall * precision / (recall + precision)).

decreasing
do we sort in increasing or decreasing order?
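The prior argument works together with the prior() replacement function listed in See Also; a short sketch of rescaling, assuming the glass_conf object created in the Examples section below:

prior(glass_conf)              # current class frequencies (row sums)
prior(glass_conf) <- 100       # rescale rows into relative frequencies summing to 100
print(glass_conf, digits = 1)  # relative frequencies need digits > 0
prior(glass_conf) <- NULL      # restore the original counts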
Value
A confusion matrix in a confusion object.
See Also
mlearning(), plot.confusion(), prior()
Examples
data("Glass", package = "mlbench")
# Use a little bit more informative labels for Type
Glass$Type <- as.factor(paste("Glass", Glass$Type))
# Use learning vector quantization to classify the glass types
# (using default parameters)
summary(glass_lvq <- ml_lvq(Type ~ ., data = Glass))
# Calculate cross-validated confusion matrix
(glass_conf <- confusion(cvpredict(glass_lvq), Glass$Type))
# Raw confusion matrix: no sort and no margins
print(glass_conf, sums = FALSE, sort = FALSE)
summary(glass_conf)
summary(glass_conf, type = "Fscore")
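Two possible continuations, not in the original examples (they assume plot.confusion() defaults and the availability of a "Recall" statistic in summary()):

# Visualize the confusion matrix (see plot.confusion() for the available types)
plot(glass_conf)
# Per-class statistics sorted by recall instead of F-score
summary(glass_conf, sort.by = "Recall")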