misclassCost {CustomerScoringMetrics}    R Documentation

Calculate misclassification cost

Description

Calculates the absolute misclassification cost value for a set of predictions.

Usage

misclassCost(predTest, depTest, costType = c("costRatio", "costMatrix",
  "costVector"), costs = NULL, cutoff = 0.5, dyn.cutoff = FALSE,
  predVal = NULL, depVal = NULL)

Arguments

predTest

Vector with predictions (real-valued or discrete)

depTest

Vector with real class labels

costType

Specifies how the cost information is provided. The description here covers "costRatio" and "costMatrix"; the usage signature also lists "costVector", which the example on this page uses with one cost value per test observation. For "costRatio", a single value is provided that reflects the cost ratio (the cost associated with a false negative divided by the cost associated with a false positive). For "costMatrix", a full (2x2) misclassification cost matrix should be provided in the form rbind(c(0,3),c(15,0)), where 3 is the cost of a false positive and 15 is the cost of a false negative.

costs

The cost information, in the form selected by costType (see costType).

cutoff

Threshold for converting real-valued predictions into class predictions. Default 0.5.

dyn.cutoff

Logical indicator to enable dynamic threshold determination using validation sample predictions. In this case, the function uses the validation data to determine the incidence (the occurrence percentage of the customer behavior or characteristic of interest) and chooses a cutoff value so that the number of predicted positives is equal to the number of true positives. If TRUE, the value of the cutoff parameter is ignored.

predVal

Vector with predictions (real-valued or discrete) for validation data. Only used if dyn.cutoff is TRUE.

depVal

Optional vector with true class labels for validation data. Only used if dyn.cutoff is TRUE.
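
The role of costs and cutoff can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper name sketch_misclass_cost is hypothetical, labels are assumed to be coded 0/1 with 1 as the positive class, and a prediction strictly above the cutoff is assumed to count as positive. The positions of the false positive and false negative costs follow the rbind(c(0,3),c(15,0)) form described under costType.

```r
# Hypothetical helper for illustration only; not part of CustomerScoringMetrics.
# Assumptions: labels are 0/1 (1 = positive) and pred > cutoff means "positive".
sketch_misclass_cost <- function(pred, dep, cost_matrix, cutoff = 0.5) {
  pred_class <- as.integer(pred > cutoff)
  n_fp <- sum(pred_class == 1 & dep == 0)  # false positives
  n_fn <- sum(pred_class == 0 & dep == 1)  # false negatives
  fp_cost <- cost_matrix[1, 2]             # 3 in rbind(c(0,3),c(15,0))
  fn_cost <- cost_matrix[2, 1]             # 15 in rbind(c(0,3),c(15,0))
  n_fp * fp_cost + n_fn * fn_cost
}

pred <- c(0.9, 0.8, 0.4, 0.3, 0.7, 0.1)
dep  <- c(1,   0,   1,   0,   1,   0)

# One false positive (0.8) and one false negative (0.4): 1 * 3 + 1 * 15
total_cost <- sketch_misclass_cost(pred, dep, rbind(c(0, 3), c(15, 0)))
print(total_cost)

# Assumption: a cost ratio of 5 is treated as FN cost 5 and FP cost 1.
ratio_cost <- sketch_misclass_cost(pred, dep, rbind(c(0, 1), c(5, 0)))
print(ratio_cost)
```

Whether misclassCost normalizes the false positive cost to 1 for "costRatio" is an assumption of this sketch; check the package source before relying on it.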

Value

A list with the following elements:

misclassCost

Total misclassification cost value

cutoff

The threshold value used to convert real-valued predictions to class predictions
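
When dyn.cutoff is TRUE, the returned cutoff is derived from the validation data rather than taken from the cutoff argument. The following base-R sketch shows one way such a threshold could be obtained, following the description of dyn.cutoff. It is an illustration under assumptions, not the package's code: the helper name sketch_dyn_cutoff is hypothetical, labels are assumed to be coded 0/1, and predictions greater than or equal to the cutoff are assumed to be labelled positive.

```r
# Hypothetical helper for illustration only; not part of CustomerScoringMetrics.
# Assumptions: labels are 0/1 (1 = positive) and pred >= cutoff means "positive".
sketch_dyn_cutoff <- function(pred_val, dep_val) {
  incidence <- mean(dep_val == 1)                # share of positives in validation data
  n_pos <- round(incidence * length(pred_val))   # target number of predicted positives
  if (n_pos < 1) {
    return(Inf)                                  # no positives: nothing is labelled positive
  }
  # The n_pos-th highest score becomes the cutoff. Tied scores at the
  # cutoff can yield more than n_pos predicted positives.
  sort(pred_val, decreasing = TRUE)[n_pos]
}

pred_val <- c(0.95, 0.10, 0.60, 0.30, 0.80, 0.20, 0.40, 0.05, 0.70, 0.15)
dep_val  <- c(1,    0,    1,    0,    0,    0,    1,    0,    0,    0)

dyn_cutoff <- sketch_dyn_cutoff(pred_val, dep_val)
print(dyn_cutoff)
print(sum(pred_val >= dyn_cutoff))  # number of predicted positives
```

With three positives among ten validation cases, the cutoff is set at the third-highest validation score, so three cases are predicted positive.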

Author(s)

Koen W. De Bock, kdebock@audencia.com

References

Witten, I.H., Frank, E. (2005): Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. Chapter 5. Morgan Kaufmann.

See Also

dynConfMatrix, expMisclassCost, dynAccuracy

Examples

## Load response modeling data set
data("response")
## Generate cost vector
costs <- runif(nrow(response$test), 1, 100)
## Apply misclassCost function to obtain the misclassification cost for the
## predictions for the test sample, using the cost vector generated above.
emc <- misclassCost(response$test[,2], response$test[,1], costType="costVector", costs=costs)
## The total cost is stored in the misclassCost element (see Value)
print(emc$misclassCost)


[Package CustomerScoringMetrics version 1.0.0 Index]