confusion_matrix {cvms} | R Documentation
Create a confusion matrix
Description
Creates a confusion matrix from targets and predictions. Calculates associated metrics.
Multiclass results are based on one-vs-all evaluations. Both regular averaging and weighted averaging are available. Also calculates the Overall Accuracy.
Note: In most cases you should use evaluate() instead. It has additional metrics and works in magrittr pipes (e.g. %>%) and with dplyr::group_by(). confusion_matrix() is more lightweight and may be preferred in programming when you don't need the extra stuff in evaluate().
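As a rough illustration of that note, here is a minimal sketch of the evaluate() workflow; the data frame, column names, and grouping variable are invented for the example, and the probability column is assumed to hold probabilities of the second (alphabetically sorted) class. See ?evaluate for the exact requirements.
# Minimal sketch of the evaluate() workflow mentioned above
library(cvms)
library(dplyr)
df <- data.frame(
  fold = rep(1:2, each = 6),                # invented grouping variable
  target = rep(c("cat", "dog"), 6),         # invented true classes
  prob_dog = runif(12)                      # assumed: probability of the second class
)
df %>%
  dplyr::group_by(fold) %>%
  evaluate(
    target_col = "target",
    prediction_cols = "prob_dog",
    type = "binomial"
  )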
Usage
confusion_matrix(
targets,
predictions,
metrics = list(),
positive = 2,
c_levels = NULL,
do_one_vs_all = TRUE,
parallel = FALSE
)
Arguments
targets
Vector of true classes. Either numeric or character.
predictions
Vector of predicted classes. Either numeric or character.
metrics
list for enabling/disabling metrics. E.g. list("Accuracy" = TRUE) would enable Accuracy, while list("F1" = FALSE) would disable F1. You can enable/disable all metrics at once by including "all" = TRUE/FALSE in the list. The list can be created with binomial_metrics() or multinomial_metrics(). Also accepts the string "all".
positive
Level from c_levels to denote as the positive class. Either the level name or its index (1 or 2) in the alphabetically sorted levels. E.g. if we have the levels "cat" and "dog" and want "dog" as the positive class, we can pass either "dog" or 2, as "dog" comes second alphabetically. Note: For reproducibility, it's preferable to specify the name directly, as different locales may sort the levels differently.
c_levels
Vector with the categorical levels in targets. Should only be specified when targets does not contain all the possible levels. N.B. the levels are sorted alphabetically. When positive is an index, it refers to this sorted order.
do_one_vs_all
Whether to perform one-vs-all evaluations when working with more than 2 classes (multiclass). If you are only interested in the confusion matrix, this allows you to skip most of the metric calculations.
parallel
Whether to perform the one-vs-all evaluations in parallel. (Logical) N.B. This only makes sense when you have a lot of classes or a very large dataset. Remember to register a parallel backend first. E.g. with doParallel::registerDoParallel.
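A small usage sketch for these arguments; the class labels are invented, and doParallel is just one possible parallel backend.
# Sketch of the arguments above, with invented class labels
targets <- c("cat", "dog", "dog", "cat", "dog")
predictions <- c("dog", "dog", "cat", "cat", "dog")
# Specify the positive class by name (preferable for reproducibility)
# and enable only the Accuracy metric
confusion_matrix(
  targets = targets,
  predictions = predictions,
  positive = "dog",
  metrics = list("all" = FALSE, "Accuracy" = TRUE)
)
# When `targets` does not contain all possible classes,
# supply the full set via `c_levels`
confusion_matrix(
  targets = c("cat", "cat", "cat"),
  predictions = c("cat", "dog", "cat"),
  c_levels = c("cat", "dog")
)
# With many classes, the one-vs-all evaluations can be run in parallel
# after registering a backend, e.g.:
# doParallel::registerDoParallel(2)
# confusion_matrix(targets, predictions, parallel = TRUE)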
Details
The following formulas are used for calculating the metrics:
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
Pos Pred Value = TP / (TP + FP)
Neg Pred Value = TN / (TN + FN)
Balanced Accuracy = (Sensitivity + Specificity) / 2
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Overall Accuracy = Correct / (Correct + Incorrect)
F1 = 2 * Pos Pred Value * Sensitivity / (Pos Pred Value + Sensitivity)
MCC = ((TP * TN) - (FP * FN)) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
Note for MCC: The formula shown is for the binary case. When the denominator is 0, we set it to 1 to avoid NaN. See the metrics vignette for the multiclass version.
Detection Rate = TP / (TP + FN + TN + FP)
Detection Prevalence = (TP + FP) / (TP + FN + TN + FP)
Threat Score = TP / (TP + FN + FP)
False Neg Rate = 1 - Sensitivity
False Pos Rate = 1 - Specificity
False Discovery Rate = 1 - Pos Pred Value
False Omission Rate = 1 - Neg Pred Value
For Kappa, the counts (TP, TN, FP, FN) are normalized to percentages (summing to 1). Then the following is calculated:
p_observed = TP + TN
p_expected = (TN + FP) * (TN + FN) + (FN + TP) * (FP + TP)
Kappa = (p_observed - p_expected) / (1 - p_expected)
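These formulas can be mirrored directly in base R. The following sketch recomputes a few of them from raw counts; it is an illustration only, not the package's internal code.
# Recompute a few of the formulas above from raw counts
TP <- 4; TN <- 3; FP <- 2; FN <- 1
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
pos_pred_value <- TP / (TP + FP)
balanced_accuracy <- (sensitivity + specificity) / 2
f1 <- 2 * pos_pred_value * sensitivity / (pos_pred_value + sensitivity)
mcc_denominator <- sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
if (mcc_denominator == 0) mcc_denominator <- 1  # avoid NaN, as noted above
mcc <- ((TP * TN) - (FP * FN)) / mcc_denominator
# Kappa: normalize the counts to percentages first
total <- TP + TN + FP + FN
tp <- TP / total; tn <- TN / total; fp <- FP / total; fn <- FN / total
p_observed <- tp + tn
p_expected <- (tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)
kappa <- (p_observed - p_expected) / (1 - p_expected)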
Value
A tibble with:
Nested confusion matrix (tidied version)
Nested confusion matrix (table)
The Positive Class.
Multiclass only: Nested Class Level Results with the two-class metrics, the nested confusion matrices, and the Support metric, which is a count of the class in the target column and is used for the weighted average metrics.
The following metrics are available (see `metrics`):
Two classes or more
Metric | Name | Default |
Balanced Accuracy | "Balanced Accuracy" | Enabled |
Accuracy | "Accuracy" | Disabled |
F1 | "F1" | Enabled |
Sensitivity | "Sensitivity" | Enabled |
Specificity | "Specificity" | Enabled |
Positive Predictive Value | "Pos Pred Value" | Enabled |
Negative Predictive Value | "Neg Pred Value" | Enabled |
Kappa | "Kappa" | Enabled |
Matthews Correlation Coefficient | "MCC" | Enabled |
Detection Rate | "Detection Rate" | Enabled |
Detection Prevalence | "Detection Prevalence" | Enabled |
Prevalence | "Prevalence" | Enabled |
False Negative Rate | "False Neg Rate" | Disabled |
False Positive Rate | "False Pos Rate" | Disabled |
False Discovery Rate | "False Discovery Rate" | Disabled |
False Omission Rate | "False Omission Rate" | Disabled |
Threat Score | "Threat Score" | Disabled |
The Name column refers to the name used in the package. This is the name in the output and when enabling/disabling in `metrics`.
Three classes or more
The metrics mentioned above (excluding MCC) each have a weighted average version (disabled by default; weighted by the Support). In order to enable a weighted metric, prefix the metric name with "Weighted " when specifying `metrics`. E.g. metrics = list("Weighted Accuracy" = TRUE).
Metric | Name | Default |
Overall Accuracy | "Overall Accuracy" | Enabled |
Weighted * | "Weighted *" | Disabled |
Multiclass MCC | "MCC" | Enabled |
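A sketch of working with this output in the multiclass case; the "Class Level Results" column name and the "Weighted Balanced Accuracy" metric name follow the naming scheme described above, but check them against your own output.
# Multiclass sketch: enable a weighted metric and inspect the nested results
targets <- c(0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0)
predictions <- c(2, 1, 0, 2, 0, 1, 1, 2, 0, 1, 2, 0, 2)
cm <- confusion_matrix(
  targets, predictions,
  metrics = list("Weighted Balanced Accuracy" = TRUE)
)
cm[["Overall Accuracy"]]
cm[["Weighted Balanced Accuracy"]]
cm[["Class Level Results"]][[1]]  # one-vs-all metrics per class, incl. Support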
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other evaluation functions: binomial_metrics(), evaluate(), evaluate_residuals(), gaussian_metrics(), multinomial_metrics()
Examples
# Attach cvms
library(cvms)
# Two classes
# Create targets and predictions
targets <- c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
predictions <- c(1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0)
# Create confusion matrix with default metrics
cm <- confusion_matrix(targets, predictions)
cm
cm[["Confusion Matrix"]]
cm[["Table"]]
# Three classes
# Create targets and predictions
targets <- c(0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0)
predictions <- c(2, 1, 0, 2, 0, 1, 1, 2, 0, 1, 2, 0, 2)
# Create confusion matrix with default metrics
cm <- confusion_matrix(targets, predictions)
cm
cm[["Confusion Matrix"]]
cm[["Table"]]
# Enabling weighted accuracy
# Create confusion matrix with Weighted Accuracy enabled
cm <- confusion_matrix(targets, predictions,
metrics = list("Weighted Accuracy" = TRUE)
)
cm