R: Threshold selection method

MAX {mlquantify}

R Documentation

Threshold selection method

Description

Quantifies events from test scores using the MAX method of Forman (2006). It works like T50, but selects the threshold at which `tpr - fpr` is maximized.
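The idea can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: `max_sketch` is a hypothetical helper, the column names `thr`, `tpr` and `fpr` are an assumed layout for the rates table, and the toy numbers are made up. It picks the threshold with the largest `tpr - fpr`, counts the test scores at or above it, and applies the adjusted-count correction described by Forman (2006).

```r
# Hypothetical helper illustrating the MAX idea; not the mlquantify source.
# Assumes `tprfpr` has columns `thr`, `tpr` and `fpr` (assumed layout).
max_sketch <- function(test, tprfpr) {
  # Select the row whose threshold maximizes tpr - fpr
  best <- tprfpr[which.max(tprfpr$tpr - tprfpr$fpr), ]
  # Classify and count: fraction of test scores at or above that threshold
  observed <- mean(test >= best$thr)
  # Adjusted count, clipped to the valid range [0, 1]
  pos <- (observed - best$fpr) / (best$tpr - best$fpr)
  pos <- min(max(pos, 0), 1)
  # Estimated class distribution: positive, then negative
  c(pos, 1 - pos)
}

# Toy rates for three thresholds (made-up values for illustration only)
toy <- data.frame(thr = c(0.25, 0.50, 0.75),
                  tpr = c(0.95, 0.80, 0.50),
                  fpr = c(0.60, 0.20, 0.05))
est <- max_sketch(c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1), toy)
print(est)
```

Maximizing `tpr - fpr` keeps the denominator of the correction as large as possible, which makes the adjusted estimate less sensitive to small errors in the estimated rates.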

Usage

MAX(test, TprFpr)

Arguments

`test`	a numeric `vector` containing the positive-class score estimated for each test set instance.
`TprFpr`	a `data.frame` of true positive (`tpr`) and false positive (`fpr`) rates estimated on the training set using the function `getTPRandFPRbyThreshold()`.

Value

A numeric vector containing the class distribution estimated from the test set.

References

Forman, G. (2006, August). Quantifying trends accurately despite classifier error and class imbalance. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 157-166). <doi.org/10.1145/1150402.1150423>

Examples

library(mlquantify)   # provides aeAegypti, getTPRandFPRbyThreshold() and MAX()
library(randomForest)
library(caret)

# Split the data into three folds: training, validation and test
cv <- createFolds(aeAegypti$class, 3)
tr <- aeAegypti[cv$Fold1,]
validation <- aeAegypti[cv$Fold2,]
ts <- aeAegypti[cv$Fold3,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])

# Train the scorer and estimate tpr/fpr per threshold on the validation set
scorer <- randomForest(class~., data=tr, ntree=500)
scores <- cbind(predict(scorer, validation, type = c("prob")), validation$class)
TprFpr <- getTPRandFPRbyThreshold(scores)

# Score the test sample and quantify using the positive-class scores (column 1)
test.scores <- predict(scorer, ts_sample, type = c("prob"))
MAX(test=test.scores[,1], TprFpr=TprFpr)

[Package mlquantify version 0.2.0 Index]