sm {nomclust}    R Documentation

Simple Matching Coefficient (SM)

Description

The function calculates a dissimilarity matrix based on the simple matching (SM) similarity measure.

Usage

sm(data, var.weights = NULL)

Arguments

data

A data.frame or a matrix with cases in rows and variables in columns.

var.weights

A numeric vector of weights for the variables in data. Each weight is a real number from zero to one.

Details

The simple matching coefficient (Sokal and Michener, 1958) is the simplest way of measuring similarity between two cases described by categorical variables. It does not impose any weights on matches or mismatches: for a given variable, it assigns the value 1 in case of a match and the value 0 otherwise.
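
The matching rule above can be illustrated with a short base R sketch. The helper sm_sketch and the toy data are hypothetical and are not part of nomclust. The sketch ignores var.weights and assumes that the dissimilarity between two cases is one minus the share of matching variables; the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of nomclust): unweighted simple matching.
# Assumption: dissimilarity = 1 - (share of variables on which two cases match).
sm_sketch <- function(data) {
  x <- as.matrix(data)          # categorical values compared as strings
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # per-variable similarity: 1 for a match, 0 otherwise
      matches <- x[i, ] == x[j, ]
      d[i, j] <- 1 - mean(matches)
    }
  }
  as.dist(d)                    # object of class "dist"
}

# Made-up toy data: three cases, three nominal variables
toy <- data.frame(colour = c("red", "red", "blue"),
                  shape  = c("round", "square", "square"),
                  size   = c("small", "small", "large"))
d_toy <- sm_sketch(toy)
```

In the toy data, cases 1 and 2 match on two of three variables, so their dissimilarity under this assumption is 1/3; cases 1 and 3 match on none, giving 1.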

Value

The function returns an object of the class "dist".
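
Because the result is a "dist" object, it can be handled with standard base R tools. The sketch below uses a small hand-made matrix with made-up values as a stand-in for the output of sm(), so that it runs without the package; as.dist, as.matrix, hclust and cutree come from base R and the stats package.

```r
# Made-up 4 x 4 dissimilarities standing in for the output of sm()
m <- matrix(c(0.0, 0.2, 0.6, 0.8,
              0.2, 0.0, 0.4, 1.0,
              0.6, 0.4, 0.0, 0.6,
              0.8, 1.0, 0.6, 0.0), nrow = 4, byrow = TRUE)
d <- as.dist(m)                      # same class as the value returned by sm()

full <- as.matrix(d)                 # expand to a full symmetric matrix
hc <- hclust(d, method = "average")  # hierarchical clustering on the dissimilarities
groups <- cutree(hc, k = 2)          # cluster membership for a two-cluster solution
```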

Author(s)

Zdenek Sulc.
Contact: zdenek.sulc@vse.cz

References

Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, pp. 243-254.

Sokal R., Michener C. (1958). A statistical method for evaluating systematic relationships. In: Science bulletin, 38(22), The University of Kansas.

See Also

anderberg, burnaby, eskin, gambaryan, goodall1, goodall2, goodall3, goodall4, iof, lin, lin1, of, smirnov, ve, vm.

Examples

# sample data
data(data20)

# dissimilarity matrix calculation
prox.sm <- sm(data20)

# dissimilarity matrix calculation with variable weights
weights.sm <- sm(data20, var.weights = c(0.7, 1, 0.9, 0.5, 0))


[Package nomclust version 2.8.0 Index]