mlr_measures_classif.mcc {mlr3} | R Documentation
Matthews Correlation Coefficient
Description
Measure to compare true observed labels with predicted labels in multiclass classification tasks.
Details
In the binary case, the Matthews Correlation Coefficient is defined as
\frac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP}) (\mathrm{TP} + \mathrm{FN}) (\mathrm{TN} + \mathrm{FP}) (\mathrm{TN} + \mathrm{FN})}},
where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives, respectively.
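The binary formula can be checked by hand with a few lines of base R. The following is a minimal sketch, not the implementation used by mlr3measures: `binary_mcc` is a hypothetical helper name, and replacing a zero denominator by 1 follows the convention stated later in this section.

```r
# Hypothetical helper (not part of mlr3 or mlr3measures):
# binary MCC computed directly from the four confusion-matrix counts.
binary_mcc = function(tp, fp, tn, fn) {
  # work in doubles so the products cannot overflow integer storage
  tp = as.numeric(tp)
  fp = as.numeric(fp)
  tn = as.numeric(tn)
  fn = as.numeric(fn)

  numerator = tp * tn - fp * fn
  denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

  # assumption taken from this page: an undefined (zero) denominator is set to 1
  if (denominator == 0) {
    denominator = 1
  }
  numerator / denominator
}

binary_mcc(tp = 5, fp = 0, tn = 5, fn = 0)  # perfect prediction: 1
binary_mcc(tp = 0, fp = 5, tn = 0, fn = 5)  # every label flipped: -1
binary_mcc(tp = 10, fp = 0, tn = 0, fn = 0) # zero denominator: 0
```

The third call shows the degenerate case: only one class is present and predicted, so two of the four sums are 0 and the score falls back to 0.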
In the multi-class case, the Matthews Correlation Coefficient is defined for a multi-class confusion matrix C with K classes as
\frac{c \cdot s - \sum_k^K p_k \cdot t_k}{\sqrt{(s^2 - \sum_k^K p_k^2) \cdot (s^2 - \sum_k^K t_k^2)}},
where
- s = \sum_i^K \sum_j^K C_{ij}: total number of samples,
- c = \sum_k^K C_{kk}: total number of correctly predicted samples,
- t_k = \sum_i^K C_{ik}: number of predictions for each class k,
- p_k = \sum_j^K C_{kj}: number of true occurrences for each class k.
The above formula is undefined if any of the four sums in the denominator is 0 in the binary case and, more generally, if either s^2 - \sum_k^K p_k^2 or s^2 - \sum_k^K t_k^2 is equal to 0.
The denominator is then set to 1.
When there are more than two classes, the MCC will no longer range between -1 and +1.
Instead, the minimum value will be between -1 and 0 depending on the true distribution. The maximum value is always +1.
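The multi-class definition, the zero-denominator rule, and the reduced lower bound can be illustrated together in base R. This is a sketch under stated assumptions: `multiclass_mcc` is a hypothetical helper, not a function from mlr3 or mlr3measures, and it takes an already computed K x K confusion matrix. Because the formula is symmetric in t_k and p_k, transposing the matrix does not change the result.

```r
# Hypothetical helper (not part of mlr3 or mlr3measures):
# multi-class MCC from a K x K confusion matrix C.
multiclass_mcc = function(C) {
  C = as.matrix(C)
  storage.mode(C) = "double"

  s = sum(C)               # total number of samples
  n_correct = sum(diag(C)) # c in the formula: correctly predicted samples
  t_k = colSums(C)         # t_k = sum_i C_ik
  p_k = rowSums(C)         # p_k = sum_j C_kj

  numerator = n_correct * s - sum(p_k * t_k)
  denominator = sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))

  # convention from this page: an undefined (zero) denominator is set to 1
  if (denominator == 0) {
    denominator = 1
  }
  numerator / denominator
}

# perfect prediction on three balanced classes
multiclass_mcc(diag(3)) # 1

# three balanced classes, every sample assigned to a wrong class
C_wrong = matrix(c(0, 1, 0,
                   0, 0, 1,
                   1, 0, 0), nrow = 3, byrow = TRUE)
multiclass_mcc(C_wrong) # -0.5

# one column is entirely zero, so s^2 - sum(t_k^2) is 0
C_const = matrix(c(2, 0,
                   3, 0), nrow = 2, byrow = TRUE)
multiclass_mcc(C_const) # 0
```

The second call shows why the lower bound moves: with three balanced classes, a prediction that is wrong on every sample still only reaches -0.5, whereas the perfect prediction reaches +1.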
Dictionary
This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():
mlr_measures$get("classif.mcc")
msr("classif.mcc")
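As a usage sketch, the measure can be passed to the `$score()` method of a prediction object. The task (`"sonar"`) and learner (`"classif.rpart"`) below are illustrative choices only and are not prescribed by this page. Predicting on the training data is done purely for brevity, so the resulting score is optimistic.

```r
library(mlr3)

# illustrative choices: a binary example task and a decision tree learner
task = tsk("sonar")
learner = lrn("classif.rpart")

# train and predict on the same data (for brevity only)
learner$train(task)
prediction = learner$predict(task)

# instantiate the measure and score the prediction
measure = msr("classif.mcc")
score = prediction$score(measure)
print(score)
```

For an honest performance estimate, the same measure can be used with a resampling instead, for example via `resample()` followed by `$aggregate(measure)`.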
Parameters
Empty ParamSet
Meta Information
- Type: "classif"
- Range: [-1, 1]
- Minimize: FALSE
- Required prediction: response
Note
The score function calls mlr3measures::mcc() from package mlr3measures.
If the measure is undefined for the input, NaN is returned.
This can be customized by setting the field na_value.
See Also
Dictionary of Measures: mlr_measures
as.data.table(mlr_measures) for a complete table of all (also dynamically created) Measure implementations.
Other classification measures: mlr_measures_classif.acc, mlr_measures_classif.auc, mlr_measures_classif.bacc, mlr_measures_classif.bbrier, mlr_measures_classif.ce, mlr_measures_classif.costs, mlr_measures_classif.dor, mlr_measures_classif.fbeta, mlr_measures_classif.fdr, mlr_measures_classif.fn, mlr_measures_classif.fnr, mlr_measures_classif.fomr, mlr_measures_classif.fp, mlr_measures_classif.fpr, mlr_measures_classif.logloss, mlr_measures_classif.mauc_au1p, mlr_measures_classif.mauc_au1u, mlr_measures_classif.mauc_aunp, mlr_measures_classif.mauc_aunu, mlr_measures_classif.mbrier, mlr_measures_classif.npv, mlr_measures_classif.ppv, mlr_measures_classif.prauc, mlr_measures_classif.precision, mlr_measures_classif.recall, mlr_measures_classif.sensitivity, mlr_measures_classif.specificity, mlr_measures_classif.tn, mlr_measures_classif.tnr, mlr_measures_classif.tp, mlr_measures_classif.tpr
Other multiclass classification measures: mlr_measures_classif.acc, mlr_measures_classif.bacc, mlr_measures_classif.ce, mlr_measures_classif.costs, mlr_measures_classif.logloss, mlr_measures_classif.mauc_au1p, mlr_measures_classif.mauc_au1u, mlr_measures_classif.mauc_aunp, mlr_measures_classif.mauc_aunu, mlr_measures_classif.mbrier