perfScores {MKclass}    R Documentation
Compute Performance Scores for Binary Classification
Description
The function computes various performance scores for binary classification.
Usage
perfScores(pred, truth, namePos, wBS = 0.5, scores = "all", transform = FALSE)
Arguments
pred
    numeric values that shall be used for classification; e.g. probabilities to belong to the positive group.
truth
    true grouping vector or factor.
namePos
    value representing the positive group.
wBS
    weight used for computing the weighted Brier score (BS); the positive BS is multiplied by wBS and the negative BS by 1 - wBS.
scores
    character vector giving the scores that shall be computed; see details. Default "all" computes all available scores.
transform
    logical value indicating whether the values in pred shall be transformed to the interval [0,1]; see details.
Details
The function perfScores can be used to compute various performance scores. For computing specific scores, the abbreviations given in parentheses have to be specified in the argument scores. Single scores can also be computed by the respective functions, whose names are identical to the abbreviations given in parentheses.
The available scores are: area under the ROC curve (AUC), Gini index (GINI), Brier score (BS), positive Brier score (PBS), negative Brier score (NBS), weighted Brier score (WBS), balanced Brier score (BBS), Brier skill score (BSS).
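As an illustrative sketch of how the Brier score variants relate (based on the usual definitions, not necessarily the exact internal code of perfScores), they can be computed by hand for predictions pred in [0,1] and a 0/1 truth vector y; all values below are hypothetical:
y <- c(0, 0, 1, 1, 1)                    # hypothetical 0/1 truth
pred <- c(0.1, 0.4, 0.35, 0.8, 0.9)      # hypothetical predicted probabilities
wBS <- 0.5
BS  <- mean((pred - y)^2)                # overall Brier score
PBS <- mean((pred[y == 1] - 1)^2)        # Brier score within the positive group
NBS <- mean((pred[y == 0] - 0)^2)        # Brier score within the negative group
WBS <- wBS * PBS + (1 - wBS) * NBS       # weighted Brier score
BBS <- 0.5 * (PBS + NBS)                 # balanced Brier score (equal group weights)
BSS <- 1 - BS / mean((mean(y) - y)^2)    # Brier skill score vs. a constant reference forecast
## the Gini index is related to the AUC by GINI = 2 * AUC - 1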
If the predictions (pred) are not in the interval [0,1], the various Brier scores are not valid. By setting the argument transform to TRUE, a simple logistic regression model is fitted to the provided data and the predicted values are used for the computations.
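A rough sketch of what such a transformation amounts to (an assumption about the general idea, not the package internals): fit a logistic regression of the true group on the provided values and use the fitted probabilities, which lie in [0,1]. The object names below are hypothetical:
x <- rnorm(20)                            # hypothetical real-valued scores outside [0,1]
y <- rbinom(20, size = 1, prob = 0.5)     # hypothetical 0/1 truth
fit01 <- glm(y ~ x, family = binomial())  # simple logistic regression of truth on scores
p01 <- predict(fit01, type = "response")  # fitted probabilities in [0,1], usable for Brier scores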
Value
A data.frame with the names of the scores and their respective values.
Author(s)
Matthias Kohl Matthias.Kohl@stamats.de
References
G.W. Brier (1950). Verification of forecasts expressed in terms of probability. Mon. Wea. Rev. 78, 1-3.
T. Fawcett (2006). An introduction to ROC analysis. Pattern Recognition Letters 27, 861-874.
T.A. Gerds, T. Cai, M. Schumacher (2008). The performance of risk prediction models. Biom J 50, 457-479.
D. Hand, R. Till (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45, 171-186.
J. Hernandez-Orallo, P.A. Flach, C. Ferri (2011). Brier curves: a new cost-based visualisation of classifier performance. In L. Getoor and T. Scheffer (eds.) Proceedings of the 28th International Conference on Machine Learning (ICML-11), 585-592 (ACM, New York, NY, USA).
J. Hernandez-Orallo, P.A. Flach, C. Ferri (2012). A unified view of performance metrics: Translating threshold choice into expected classification loss. J. Mach. Learn. Res. 13, 2813-2869.
B.W. Matthews (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure 405, 442-451.
See Also
Examples
## example from dataset infert
fit <- glm(case ~ spontaneous + induced, data = infert, family = binomial())
pred <- predict(fit, type = "response")
## with group numbers
perfScores(pred, truth = infert$case, namePos = 1)
## with group names
my.case <- factor(infert$case, labels = c("control", "case"))
perfScores(pred, truth = my.case, namePos = "case")
## on the scale of the linear predictors
pred2 <- predict(fit)
perfScores(pred2, truth = infert$case, namePos = 1)
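## illustrative additions (not from the original help page):
## letting perfScores transform the linear predictors to [0,1] itself
perfScores(pred2, truth = infert$case, namePos = 1, transform = TRUE)
## restricting the output to selected scores
perfScores(pred, truth = infert$case, namePos = 1, scores = c("AUC", "BS"))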
## using weights
perfScores(pred, truth = infert$case, namePos = 1, wBS = 0.3)