cpm {cpfa}    R Documentation

Classification Performance Measures

Description

Calculates multiple performance measures for binary or multiclass classification by comparing known class labels against predicted class labels.

Usage

cpm(x, y, level = NULL, fbeta = NULL, prior = NULL)

Arguments

x

Known class labels of class numeric, factor, or integer. If factor, converted to class integer in the order of factor levels with integers beginning at 0 (i.e., for binary classification, factor levels become 0 and 1; for multiclass, levels become 0, 1, 2, etc.).

y

Predicted class labels of class numeric, factor, or integer. If factor, converted to class integer in the order of factor levels with integers beginning at 0 (i.e., for binary classification, factor levels become 0 and 1; for multiclass, 0, 1, 2, etc.).

level

Optional argument specifying all possible class labels, for cases when x or y do not contain every possible class. Can be of class numeric, integer, or character. Must contain two elements for binary classification and three or more elements for multiclass classification. If integer, integers should be ordered (e.g., binary with c(0, 1); or three-class with c(0, 1, 2)). Note: if x and y jointly contain only a single value (e.g., 1), argument level must be specified in order to identify the classification as binary or multiclass (see the sketch after this argument list).

fbeta

Optional numeric argument specifying beta value for F-score. Defaults to fbeta = 1, providing an F1-score (i.e., the balanced harmonic mean between precision and recall). Can be any real number.

prior

Optional numeric argument specifying weights for classes. Currently implemented only for multiclass problems. Defaults to prior = rep(1/llev, llev), where llev is the number of classes, giving equal importance to all classes (illustrated in the sketch below).
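
For illustration, a minimal sketch of calls using level and prior (the labels and weights below are arbitrary examples, not defaults from the package):

# binary case where x and y jointly contain a single value, so level is required
cpm(x = c(1, 1, 1), y = c(1, 1, 1), level = c(0, 1))

# three-class case with class 2 given twice the weight of classes 0 and 1
cpm(x = c(0, 1, 2, 2, 1, 0), y = c(0, 2, 2, 1, 1, 0), prior = c(0.25, 0.25, 0.5))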

Details

Selecting one class as the negative class and the other as the positive class, binary classification generates four possible outcomes: (1) negative cases classified as positives, called false positives (FP); (2) negative cases classified as negatives, called true negatives (TN); (3) positive cases classified as negatives, called false negatives (FN); and (4) positive cases classified as positives, called true positives (TP).

Multiple evaluation measures are calculated using these four outcomes. Measures include: overall error (ERR), also called fraction incorrect; overall accuracy (ACC), also called fraction correct; true positive rate (TPR), also called recall, hit rate, or sensitivity; false negative rate (FNR), also called miss rate; false positive rate (FPR), also called fall-out; true negative rate (TNR), also called specificity or selectivity; positive predictive value (PPV), also called precision; false discovery rate (FDR); negative predictive value (NPV); false omission rate (FOR); and F-score (FS).

In multiclass classification, macro-averaging computes these four outcomes for each individual class (treating that class as the positive class and all other classes as negative), and the performance measures are then averaged over classes. Macro-averaging gives equal importance to all classes. For multiclass classification, calculated measures are currently only macro-averaged. See the reference listed in this help file for additional details on micro-averaging.

Note that, for binary classification, this function assumes one class is the negative (reference) class and the other is the positive class, so the two classes are treated as ordered. Multiclass classification is currently assumed to be unordered.

Computational details:

ERR = (FP + FN) / (TP + TN + FP + FN).

ACC = (TP + TN) / (TP + TN + FP + FN), and ACC = 1 - ERR.

TPR = TP / (TP + FN).

FNR = FN / (FN + TP), and FNR = 1 - TPR.

FPR = FP / (FP + TN).

TNR = TN / (TN + FP), and TNR = 1 - FPR.

PPV = TP / (TP + FP).

FDR = FP / (FP + TP), and FDR = 1 - PPV.

NPV = TN / (TN + FN).

FOR = FN / (FN + TN), and FOR = 1 - NPV.

FS = (1 + beta^2) * ((PPV * TPR) / (((beta^2)*PPV) + TPR)).

All performance measures calculated are between 0 and 1, inclusive. For multiclass classification, macro-averaged values are provided for each performance measure. Note that 'beta' in FS represents the relative weight such that recall (TPR) is beta times more important than precision (PPV). See reference for more details.
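
As a worked illustration of the formulas above, the binary measures can be computed directly from the four outcome counts. This is a minimal sketch in base R using hypothetical counts, not the internal code of cpm; for multiclass data, macro-averaging applies the same formulas to each class (one versus rest) and then averages the results, weighted by prior if supplied.

# hypothetical counts from a binary confusion matrix
TP <- 40; TN <- 45; FP <- 5; FN <- 10
n <- TP + TN + FP + FN

err <- (FP + FN) / n                # overall error
acc <- (TP + TN) / n                # overall accuracy, equal to 1 - err
tpr <- TP / (TP + FN)               # recall / sensitivity
fnr <- FN / (FN + TP)               # miss rate, equal to 1 - tpr
fpr <- FP / (FP + TN)               # fall-out
tnr <- TN / (TN + FP)               # specificity, equal to 1 - fpr
ppv <- TP / (TP + FP)               # precision
fdr <- FP / (FP + TP)               # equal to 1 - ppv
npv <- TN / (TN + FN)               # negative predictive value
fom <- FN / (FN + TN)               # false omission rate, equal to 1 - npv

# F-score with beta = 1 (the F1-score); larger beta weights recall more heavily
beta <- 1
fs <- (1 + beta^2) * (ppv * tpr) / ((beta^2) * ppv + tpr)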

Value

Returns a list with two elements. The first element, cm, is a full confusion matrix, and the second element is a data frame containing the performance measures. Note that for multiclass classification, macro-averaged values are provided (i.e., each measure is calculated for each class and then averaged over all classes; the average is weighted by argument prior if provided). The components are as follows:

cm

A confusion matrix with counts for each of the possible outcomes.

err

Overall error (ERR). Also called fraction incorrect.

acc

Overall accuracy (ACC). Also called fraction correct.

tpr

True positive rate (TPR). Also called recall, hit rate, or sensitivity.

fpr

False positive rate (FPR). Also called fall-out.

tnr

True negative rate (TNR). Also called specificity or selectivity.

fnr

False negative rate (FNR). Also called miss rate.

ppv

Positive predictive value (PPV). Also called precision.

npv

Negative predictive value (NPV).

fdr

False discovery rate (FDR).

fom

False omission rate (FOR).

fs

F-score (FS). Weighted harmonic mean of TPR (recall) and PPV (precision), with the relative importance given to recall over precision set by argument fbeta (see the Details section).

Author(s)

Matthew Snodgress <snodg031@umn.edu>

References

Sokolova, M. and Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing and Management, 45(4), 427-437.

Examples

########## Parafac example with 3-way array and binary response ##########

# set seed and specify dimensions of a three-way tensor
set.seed(3)
mydim <- c(10, 11, 80)
nf <- 3

# create correlation matrix between response and third mode's weights 
rho.cc <- .35 
rho.cy <- .75 
cormat.values <- c(1, rho.cc, rho.cc, rho.cy, rho.cc, 1, rho.cc, rho.cy, 
                   rho.cc, rho.cc, 1, rho.cy, rho.cy, rho.cy, rho.cy, 1)
cormat <- matrix(cormat.values, nrow = (nf + 1), ncol = (nf + 1))

# sample from a multivariate normal with specified correlation structure
ymean <- Cmean <- 2
mu <- as.matrix(c(Cmean, Cmean, Cmean, ymean))
eidecomp <- eigen(cormat, symmetric = TRUE)
L.sqrt <- diag(eidecomp$values^0.5)
cormat.sqrt <- eidecomp$vectors %*% L.sqrt %*% t(eidecomp$vectors)
Z <- matrix(rnorm(mydim[3]*(nf + 1)), nrow = mydim[3], ncol = (nf + 1))
Xw <- rep(1, mydim[3]) %*% t(mu) + Z %*% cormat.sqrt
Cmat <- Xw[, 1:nf]

# create a random three-way data tensor with C weights related to a response
Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf)
Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat))  
X <- Xmat + Emat

# create a binary response by dichotomizing at the specified response mean
y <- factor(as.numeric(Xw[ , (nf + 1)] > ymean))

# initialize
gamma <- c(0, 0.01)
cost <- c(1, 2)
method <- c("SVM")
family <- "binomial"
parameters <- list(gamma = gamma, cost = cost)
model <- "parafac"
nfolds <- 3
nstart <- 3

# constrain first mode weights to be orthogonal
const <- c("orthog", "uncons", "uncons")

# fit Parafac models and use third mode to tune classification methods
tune.object <- tunecpfa(x = X, y = y, model = model, nfac = nf, 
                        nfolds = nfolds, method = method, family = family, 
                        parameters = parameters, parallel = FALSE, 
                        const = const, nstart = nstart)
                         
# create new data with Parafac structure and C weights related to response
mydim.new <- c(10, 11, 20)
Znew <- matrix(rnorm(mydim.new[3]*(nf + 1)), 
               nrow = mydim.new[3], ncol = (nf + 1))
Xwnew <- rep(1, mydim.new[3]) %*% t(mu) + Znew %*% cormat.sqrt
Cmatnew <- Xwnew[, 1:nf]
Xnew0 <- tcrossprod(Amat, krprod(Cmatnew, Bmat))
Xnew0 <- array(Xnew0, dim = mydim.new)
Ematnew <- array(rnorm(prod(mydim.new)), dim = mydim.new)
Ematnew <- nscale(Ematnew, 0, ssnew = sumsq(Xnew0))  
Xnew <- Xnew0 + Ematnew

# create known class labels for the new data by dichotomizing at the response mean
newlabel <- as.numeric(Xwnew[, (nf + 1)] > ymean)

# predict class labels
predict.labels <- predict(object = tune.object, newdata = Xnew, 
                          type = "response")
                        
# calculate performance measures for predicted class labels
y.pred <- predict.labels[, 1]
evalmeasure <- cpm(x = newlabel, y = y.pred)

# print performance measures
evalmeasure
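
# inspect components of the returned list: the confusion matrix is stored in
# element 'cm', and the performance measures are in the second list element
# (accessed by position here, since its name is not shown in this help file)
evalmeasure$cm
evalmeasure[[2]]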
