measure {ebmc}    R Documentation

Calculating Performance Measurements in Class Imbalance Problems

Description

This function integrates multiple performance measurements that can be used to assess model performance in class imbalance problems. Six measurements are included.

Usage

measure(label, probability, metric, threshold = 0.5)

Arguments

label

A vector of the actual labels of the target variable in the test set.

probability

A vector of probabilities estimated by the model.

metric

The measurement used to assess model performance: auc, gmean, tpr, tnr, f, or acc. See Details for more information.

threshold

Probability threshold for determining the predicted class of each instance: a numeric value between 0 and 1. Default is 0.5.

Details

This function integrates six common measurements. It uses pROC::roc() and pROC::auc() to calculate auc (Area Under the Curve), and calculates the other measurements without depending on any other package: gmean (Geometric Mean), tpr (True Positive Rate), tnr (True Negative Rate), and f (F-Measure).

acc (Accuracy) is also included for completeness, although it can be misleading when the classes in the test set are highly imbalanced.
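The threshold-based measurements can be illustrated with a minimal base R sketch built on the standard confusion-matrix definitions. This is not the package's source code: threshold_metrics is a hypothetical helper, and it assumes labels coded "0"/"1" with "1" as the positive class, and that a probability at or above the threshold is predicted positive.

```r
# Hypothetical helper, not part of ebmc: threshold-based metrics from
# standard confusion-matrix definitions. Assumes labels coded "0"/"1"
# with "1" as the positive class, and that a probability at or above
# the threshold is predicted positive (both are assumptions).
threshold_metrics <- function(label, probability, threshold = 0.5) {
  actual    <- as.character(label) == "1"
  predicted <- probability >= threshold

  tp <- sum(predicted & actual)    # true positives
  tn <- sum(!predicted & !actual)  # true negatives
  fp <- sum(predicted & !actual)   # false positives
  fn <- sum(!predicted & actual)   # false negatives

  tpr       <- tp / (tp + fn)
  tnr       <- tn / (tn + fp)
  precision <- tp / (tp + fp)

  c(
    tpr   = tpr,
    tnr   = tnr,
    gmean = sqrt(tpr * tnr),
    f     = 2 * precision * tpr / (precision + tpr),
    acc   = (tp + tn) / length(actual)
  )
}

# Toy data: 8 negatives and 2 positives
label <- factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1))
prob  <- c(0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.6, 0.7, 0.4)
threshold_metrics(label, prob)
```

On this toy data the accuracy is 0.8 even though only one of the two positive instances is detected (tpr = 0.5). A model that predicted every instance as negative would reach the same accuracy, which is why gmean and f are usually more informative under imbalance.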

threshold is the probability cutoff for determining the predicted class of instances. For auc, users do not need to specify threshold because AUC is not affected by the probability cutoff. The other five measurements do depend on threshold, which defaults to 0.5.
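To see why AUC needs no cutoff, consider a sketch that computes it from the ranks of the probabilities alone. The package itself relies on pROC for this; the rank-sum (Mann-Whitney) formulation below is swapped in purely for illustration. rank_auc is a hypothetical helper that assumes labels coded "0"/"1" with "1" as the positive class and higher probabilities indicating the positive class.

```r
# Hypothetical helper, not part of ebmc or pROC: AUC via the rank-sum
# (Mann-Whitney) formulation. Assumes labels coded "0"/"1" with "1" as
# the positive class.
rank_auc <- function(label, probability) {
  actual <- as.character(label) == "1"
  n_pos  <- sum(actual)
  n_neg  <- sum(!actual)
  r      <- rank(probability)  # tied values receive average ranks
  (sum(r[actual]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data: 8 negatives and 2 positives
label <- factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1))
prob  <- c(0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.6, 0.7, 0.4)
rank_auc(label, prob)
```

No threshold appears anywhere in the computation: the result depends only on how the positive instances are ordered relative to the negative ones.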

Examples

data("iris")
iris <- iris[1:70, ]
iris$Species <- factor(iris$Species, levels = c("setosa", "versicolor"), labels = c("0", "1"))

# Create training and test sets
samp <- sample(nrow(iris), nrow(iris) * 0.7)
train <- iris[samp, ]
test <- iris[-samp, ]

# Model building and prediction
model <- rus(Species ~ ., data = train, size = 10, alg = "c50")
prob <- predict(model, newdata = test, type = "prob")

# Calculate measurements
auc <- measure(label = test$Species, probability = prob, metric = "auc")
gmean <- measure(label = test$Species, probability = prob, metric = "gmean", threshold = 0.5)

[Package ebmc version 1.0.1 Index]