mcc {mlr3measures}    R Documentation

Matthews Correlation Coefficient


Measure to compare true observed labels with predicted labels in multiclass classification tasks.


mcc(truth, response, positive = NULL, ...)



truth: True (observed) labels. Must have the same levels and length as response.


response: Predicted response labels. Must have the same levels and length as truth.


positive: (character(1)) Name of the positive class in case of binary classification.


...: Additional arguments. Currently ignored.


In the binary case, the Matthews Correlation Coefficient is defined as

\[ \frac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP}) (\mathrm{TP} + \mathrm{FN}) (\mathrm{TN} + \mathrm{FP}) (\mathrm{TN} + \mathrm{FN})}}, \]

where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives, respectively.
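
As a minimal sketch of the binary formula, the base R function below counts the four cells and evaluates the fraction. The helper name binary_mcc is hypothetical and not part of mlr3measures; it is meant to illustrate the formula, not to reproduce the package's implementation. It uses a denominator of 1 when the square root is 0, following the convention this page describes.

```r
# Hypothetical helper (not part of mlr3measures): binary MCC from two label vectors.
binary_mcc = function(truth, response, positive) {
  # Convert counts to double so the products below cannot overflow integers.
  tp = as.numeric(sum(truth == positive & response == positive))
  tn = as.numeric(sum(truth != positive & response != positive))
  fp = as.numeric(sum(truth != positive & response == positive))
  fn = as.numeric(sum(truth == positive & response != positive))

  num = tp * tn - fp * fn
  den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  if (den == 0) den = 1  # convention described on this page for a zero denominator
  num / den
}

lvls = c("a", "b")
truth = factor(c("a", "a", "a", "b", "b", "b"), levels = lvls)
response = factor(c("a", "a", "b", "b", "b", "a"), levels = lvls)

# Here TP = 2, TN = 2, FP = 1, FN = 1, so the value is (4 - 1) / 9 = 1/3.
binary_mcc(truth, response, positive = "a")
```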

In the multi-class case, the Matthews Correlation Coefficient is defined for a multi-class confusion matrix C with K classes as

\[ \frac{c \cdot s - \sum_k^K p_k \cdot t_k}{\sqrt{(s^2 - \sum_k^K p_k^2) \cdot (s^2 - \sum_k^K t_k^2)}}, \]
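
The multi-class formula can be sketched in base R as follows. The reading of the symbols is an assumption here: s is taken as the total number of observations, c as the number of correctly classified observations, and p_k and t_k as the per-class counts of predicted and true labels (the formula is symmetric in p_k and t_k, so swapping their roles gives the same value). The helper name multiclass_mcc is hypothetical and not part of mlr3measures.

```r
# Hypothetical helper (not part of mlr3measures): multi-class MCC from two factors
# that share the same levels.
multiclass_mcc = function(truth, response) {
  cm = table(response, truth)            # K x K confusion matrix, rows = predicted
  s = as.numeric(sum(cm))                # assumed: total number of observations
  n_correct = as.numeric(sum(diag(cm)))  # assumed: c, correctly classified observations
  p_k = rowSums(cm)                      # assumed: predicted count per class
  t_k = colSums(cm)                      # assumed: true count per class

  num = n_correct * s - sum(p_k * t_k)
  den = sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))
  if (den == 0) den = 1                  # zero-denominator convention of this page
  num / den
}

lvls = c("a", "b", "c")
truth = factor(c("a", "a", "b", "b", "c", "c"), levels = lvls)
response = factor(c("a", "b", "b", "b", "c", "a"), levels = lvls)

# s = 6, c = 4, p = (2, 3, 1), t = (2, 2, 2): value is 12 / sqrt(22 * 24).
multiclass_mcc(truth, response)
```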


The above formula is undefined if any of the four sums in the denominator is 0 in the binary case, and more generally if either s^2 - sum(p_k^2) or s^2 - sum(t_k^2) is equal to 0. The denominator is then set to 1. When there are more than two classes, the MCC no longer ranges between -1 and +1. Instead, the minimum value lies between -1 and 0, depending on the true distribution. The maximum value is always +1.
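
The two edge cases above can be checked on small hand-built confusion matrices. This is only a sketch: mcc_from_confusion is a hypothetical helper, not an mlr3measures function, and the matrices assume rows are predicted classes and columns are true classes.

```r
# Hypothetical helper (not part of mlr3measures): MCC from a square confusion
# matrix, with the denominator set to 1 when it would be 0.
mcc_from_confusion = function(cm) {
  s = sum(cm)
  n_correct = sum(diag(cm))
  p_k = rowSums(cm)
  t_k = colSums(cm)
  den = sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))
  if (den == 0) den = 1
  (n_correct * s - sum(p_k * t_k)) / den
}

lvls = c("a", "b", "c")
dn = list(response = lvls, truth = lvls)

# Every observation predicted as "a": s^2 - sum(p_k^2) = 36 - 36 = 0.
constant = matrix(c(2, 2, 2,
                    0, 0, 0,
                    0, 0, 0), nrow = 3, byrow = TRUE, dimnames = dn)

# Every observation misclassified in a cycle (a -> b, b -> c, c -> a).
cyclic = matrix(c(0, 0, 2,
                  2, 0, 0,
                  0, 2, 0), nrow = 3, byrow = TRUE, dimnames = dn)

# Every observation classified correctly.
perfect = diag(2, 3)

mcc_from_confusion(constant)  # 0: numerator is 0 and the denominator is set to 1
mcc_from_confusion(cyclic)    # -0.5: all predictions wrong, yet above -1
mcc_from_confusion(perfect)   # 1
```

With three balanced classes, even a prediction that is wrong on every observation does not reach -1, which matches the statement about the lower bound.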


Performance value as numeric(1).

Meta Information


Matthews BW (1975). “Comparison of the predicted and observed secondary structure of T4 phage lysozyme.” Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442–451. doi:10.1016/0005-2795(75)90109-9.

See Also

Other Classification Measures: acc(), bacc(), ce(), logloss(), mauc_aunu(), mbrier(), zero_one()


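library(mlr3measures)
set.seed(1)  # fix the seed so the randomly drawn labels are reproducible
lvls = c("a", "b", "c")
# truth and response must be factors that share the same levels
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
mcc(truth, response)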

[Package mlr3measures version 0.6.0 Index]