EMQ {mlquantify}    R Documentation

Expectation-Maximization Quantification

Description

This method is an instance of the well-known Expectation-Maximization (EM) algorithm for finding maximum-likelihood estimates of a model's parameters. It quantifies events, that is, it estimates the class distribution of the test set from the test scores, applying the Expectation Maximization for Quantification (EMQ) method proposed by Saerens et al. (2002).
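For orientation, the two alternating steps of the procedure in Saerens et al. (2002) can be written as below. The notation is chosen here for illustration and is not taken from the package: p_tr(c) is the proportion of class c in the labeled set, s_c(x_k) is the score of class c for test instance x_k, N is the number of test instances, and the estimate is initialized with the training proportions.

```latex
% Initialization (assumed): \hat{p}^{(0)}(c) = p_{tr}(c)

% E-step: rescale each score by the ratio of current to training prior, then renormalize
\hat{p}^{(t)}(c \mid x_k) =
  \frac{\dfrac{\hat{p}^{(t)}(c)}{p_{tr}(c)}\, s_c(x_k)}
       {\sum_{j} \dfrac{\hat{p}^{(t)}(j)}{p_{tr}(j)}\, s_j(x_k)}

% M-step: the new prior is the average adjusted posterior over the test set
\hat{p}^{(t+1)}(c) = \frac{1}{N} \sum_{k=1}^{N} \hat{p}^{(t)}(c \mid x_k)
```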

Usage

EMQ(train, test, it=5, e=1e-4)

Arguments

train

a data.frame of the labeled set.

test

a numeric matrix of scores predicted for each test set instance. The first column must contain the positive-class scores.

it

maximum number of iteration steps (default 5).

e

a numeric value for the stop threshold (default 1e-4). If the difference between two consecutive steps is less than or equal to e, the iterative process stops. If e is NULL, the number of iterations is determined solely by the it parameter.
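How it and e interact can be illustrated with a minimal sketch of the EM prior update. This is not the package's implementation: em_prior_sketch is a hypothetical helper, it takes the training class proportions directly instead of a data.frame, and it assumes that "difference between two consecutive steps" means the largest absolute change in the estimated proportions.

```r
# Hypothetical helper, not part of mlquantify: a sketch of the EM prior update.
# train_prior: class proportions in the labeled set (one value per class)
# scores:      matrix of scores, one row per test instance, one column per class
em_prior_sketch <- function(train_prior, scores, it = 5, e = 1e-4) {
  scores <- as.matrix(scores)
  prior <- train_prior                      # start from the training proportions
  for (step in seq_len(it)) {
    # E-step: scale each class column by prior / train_prior, renormalize rows
    w <- sweep(scores, 2, prior / train_prior, "*")
    post <- w / rowSums(w)
    # M-step: the new prior is the mean adjusted posterior per class
    new_prior <- colMeans(post)
    delta <- max(abs(new_prior - prior))    # assumed convergence measure
    prior <- new_prior
    # Stop early only when a threshold is given; with e = NULL run all `it` steps
    if (!is.null(e) && delta <= e) break
  }
  prior
}

# Toy scores for 4 test instances; the first column is the positive-class score
scores <- matrix(c(0.9, 0.1,
                   0.8, 0.2,
                   0.7, 0.3,
                   0.4, 0.6), ncol = 2, byrow = TRUE)
est <- em_prior_sketch(c(0.5, 0.5), scores)
one_step <- em_prior_sketch(c(0.5, 0.5), scores, it = 1, e = NULL)
```

With e = NULL the loop always runs it steps, so one_step is the result of a single update; with the default threshold the loop may stop before reaching it.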

Value

A numeric vector containing the class distribution estimated from the test set.

References

Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Computation. <doi.org/10.1162/089976602753284446>.

Examples

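library(mlquantify)   # provides EMQ() and the aeAegypti data set
library(randomForest)
library(caret)
set.seed(42)          # make the folds and the sample reproducible
cv <- createFolds(aeAegypti$class, 2)
tr <- aeAegypti[cv$Fold1,]
ts <- aeAegypti[cv$Fold2,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])
# Train the scorer on the labeled fold and score the test sample
scorer <- randomForest(class~., data=tr, ntree=500)
test.scores <- predict(scorer, ts_sample, type = "prob")
# Estimate the class distribution of the test sample
EMQ(train=tr, test=test.scores)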

[Package mlquantify version 0.2.0 Index]