eskin {nomclust}

R Documentation

Eskin (ES) Measure

Description

The function calculates a dissimilarity matrix for categorical data based on the Eskin (ES) similarity measure.

Usage

eskin(data, var.weights = NULL)

Arguments

`data`	A data.frame or a matrix with cases in rows and variables in columns.
`var.weights`	A numeric vector of weights for the variables in `data`, one weight per variable. Each weight must be a real number between zero and one.

Details

The Eskin similarity measure was proposed by Eskin et al. (2002) and examined by Boriah et al. (2008). It assigns higher weights to mismatches on variables with more categories.
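
For reference, the per-variable similarity can be written out as given in Boriah et al. (2008). The symbols below are notation introduced here, not taken from the package: x_{ik} is the value of case i on variable k, n_k is the number of categories of variable k, m is the number of variables, and w_k is the weight of variable k. The weighted average in the second line is an assumption about how `var.weights` enters the calculation; with equal weights it reduces to the plain mean over variables.

```latex
S_k(x_{ik}, x_{jk}) =
\begin{cases}
  1, & \text{if } x_{ik} = x_{jk}, \\[4pt]
  \dfrac{n_k^2}{n_k^2 + 2}, & \text{otherwise},
\end{cases}
\qquad
S(x_i, x_j) = \frac{\sum_{k=1}^{m} w_k \, S_k(x_{ik}, x_{jk})}{\sum_{k=1}^{m} w_k}
```

For example, a mismatch on a variable with 2 categories yields 4/6 (about 0.667), whereas a mismatch on a variable with 10 categories yields 100/102 (about 0.980).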

Value

The function returns an object of the class "dist".
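
To illustrate what such a "dist" object might contain, the following base R sketch recomputes the measure by hand from the formula in Boriah et al. (2008). It is not the package's implementation: `eskin_sketch` is a hypothetical helper, the weighted average over variables and the transformation of similarity into dissimilarity as 1/S - 1 are assumptions, and the data are assumed to contain no missing values.

```r
# Hypothetical helper, not part of nomclust: a base-R sketch of the Eskin measure.
eskin_sketch <- function(data, var.weights = NULL) {
  data <- as.data.frame(data, stringsAsFactors = FALSE)
  n <- nrow(data)
  m <- ncol(data)
  if (is.null(var.weights)) var.weights <- rep(1, m)
  stopifnot(n >= 1, length(var.weights) == m, sum(var.weights) > 0)

  # number of observed categories in each variable
  n_cat <- vapply(data, function(col) length(unique(col)), integer(1))
  # similarity assigned to a mismatch on each variable
  mismatch <- n_cat^2 / (n_cat^2 + 2)

  # compare all values as character strings
  x <- as.matrix(data.frame(lapply(data, as.character), stringsAsFactors = FALSE))

  d <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      s_k <- ifelse(x[i, ] == x[j, ], 1, mismatch)      # per-variable similarity
      s <- sum(var.weights * s_k) / sum(var.weights)    # assumed weighted average
      d[i, j] <- d[j, i] <- 1 / s - 1                   # assumed dissimilarity transform
    }
  }
  as.dist(d)
}

# toy data: variable a has 2 categories, variable b has 3
toy <- data.frame(a = c("x", "x", "y"), b = c("p", "q", "r"))
d_toy <- eskin_sketch(toy)
d_toy_weighted <- eskin_sketch(toy, var.weights = c(1, 0))
```

Because the result is a regular "dist" object, it can be converted with as.matrix() or passed to functions that accept distances, such as hclust().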

Author(s)

Zdenek Sulc.
Contact: zdenek.sulc@vse.cz

References

Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, p. 243-254.

Eskin E., Arnold A., Prerau M., Portnoy L., Stolfo S. (2002). A geometric framework for unsupervised anomaly detection. In: D. Barbara and S. Jajodia (Eds.), Applications of Data Mining in Computer Security, p. 78-100. Norwell: Kluwer Academic Publishers.

Examples

# sample data
data(data20)

# dissimilarity matrix calculation
prox.eskin <- eskin(data20)

# dissimilarity matrix calculation with variable weights
weights.eskin <- eskin(data20, var.weights = c(0.7, 1, 0.9, 0.5, 0))

[Package nomclust version 2.8.0 Index]