IE {EPX} R Documentation

Calculate Initial Enhancement

Description

Calculates initial enhancement (IE), which is the precision at a specific shortlist length (cutoff) normalised by the proportion of relevants in the full sample (Tomal et al. 2015). Because IE is a rescaling of precision, IE and the average hit rate (AHR) are expected to lead to similar conclusions when used as assessment metrics for the EPX algorithm.

Usage

IE(y, phat, cutoff = length(y)/2, ...)

Arguments

y

True (binary) response vector where 1 is the rare/relevant class.

phat

Numeric vector of estimated probabilities of relevance.

cutoff

Length of the shortlist (cutoff); must not exceed the length of y. The default is half the sample size.

...

Further arguments passed to or from other methods.

Details

Let c be the cutoff and h(c) be the hit rate at c, that is, the proportion of relevants among the c cases with the largest values of phat. Let A be the total number of relevants and N the total number of observations. IE is defined as

IE = h(c) / (A / N)
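
The definition above can be sketched directly in R. This is only an illustration under stated assumptions, not the package's implementation: the function name ie_basic is hypothetical, cutoff is taken to be a whole number, and ties in phat are ignored (cases are simply ranked with order()).

```r
# Hypothetical helper illustrating IE = h(c) / (A / N); not part of EPX.
# Assumes y is a 0/1 vector, cutoff is a whole number, and phat has no ties.
ie_basic <- function(y, phat, cutoff) {
  stopifnot(length(y) == length(phat), cutoff >= 1, cutoff <= length(y))
  # Indices of the `cutoff` cases with the largest estimated probabilities
  top <- order(phat, decreasing = TRUE)[seq_len(cutoff)]
  hit_rate <- mean(y[top])   # h(c): proportion of relevants in the shortlist
  base_rate <- mean(y)       # A / N: proportion of relevants overall
  hit_rate / base_rate
}

resp <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0)
prob <- (10:1) * 0.1
ie_basic(resp, prob, cutoff = 3)  # equals (2/3) / (3/10)
```

An IE above 1 means the shortlist is richer in relevants than the sample as a whole.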

The IE calculation is the same whether or not there are ties in phat.
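
One way to keep a shortlist metric independent of the arbitrary ordering of tied cases is to use the expected number of hits when the tied group at the cutoff is ordered at random. The sketch below illustrates that convention. It is an assumption made for illustration, not a description of how EPX handles ties internally, and the function name ie_ties_sketch is hypothetical.

```r
# Hypothetical tie-aware sketch of IE; the expected-hits convention is an
# assumption, not taken from the EPX source.
ie_ties_sketch <- function(y, phat, cutoff) {
  stopifnot(length(y) == length(phat), cutoff >= 1, cutoff <= length(y))
  thr <- sort(phat, decreasing = TRUE)[cutoff]  # phat value at the cutoff
  above <- phat > thr        # cases that are certainly in the shortlist
  tied <- phat == thr        # cases tied at the cutoff value
  slots <- cutoff - sum(above)  # shortlist places left for the tied group
  # Each remaining place is filled by a tied case chosen at random, so it
  # contributes the tied group's proportion of relevants in expectation.
  expected_hits <- sum(y[above]) + slots * mean(y[tied])
  (expected_hits / cutoff) / mean(y)
}

resp <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0)
prob <- c(1, 1, 1, 0.4, 0.4, 0.3, 0.2, 0.15, 0.1, 0)
ie_ties_sketch(resp, prob, cutoff = 3)  # equals (2/3) / (3/10)
```

When no tie straddles the cutoff, the tied group fits entirely inside the shortlist and the result reduces to the plain definition of IE.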

Value

Numeric value of IE.

References

Tomal, J. H., Welch, W. J., & Zamar, R. H. (2015). Ensembling classification models based on phalanxes of variables with applications in drug discovery. The Annals of Applied Statistics, 9(1), 69-93. doi: 10.1214/14-AOAS778

Examples

## IE when there are no ties in phat:

resp <- c(1, 1, 0,   0,   0,   0,   0,    1,   0, 0)
prob <- (10:1) * 0.1
IE(y = resp, phat = prob, cutoff = 3)
# expected answer: (2/3) / (3/10), approximately 2.22

## IE when there are ties in phat:

resp <- c(1, 1, 0,   0,   0,   0,   0,    1,   0, 0)
prob <- c(1, 1, 1, 0.4, 0.4, 0.3, 0.2, 0.15, 0.1, 0)
IE(y = resp, phat = prob, cutoff = 3)
# expected answer: same as in the example without ties, (2/3) / (3/10)

[Package EPX version 1.0.4 Index]