homogeneity {clevr} | R Documentation |
Homogeneity Between Clusterings
Description
Computes the homogeneity between two clusterings, such as a predicted clustering and a ground truth clustering.
Usage
homogeneity(true, pred)
Arguments
true |
ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary. |
pred |
predicted clustering represented as a membership vector. |
Details
Homogeneity is an entropy-based measure of the similarity between two clusterings, say t and p. Following Rosenberg and Hirschberg (2007), the homogeneity is high if each cluster in p contains only elements from a single cluster in t. The homogeneity ranges between 0 and 1, where 1 indicates perfect homogeneity.
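As a rough illustration of the entropy-based definition, the sketch below computes homogeneity as 1 - H(t | p) / H(t), the formula given in the cited Rosenberg and Hirschberg paper. The function name homogeneity_sketch is hypothetical and not part of clevr. It is an assumption that clevr's own implementation agrees with this sketch, in particular in returning 1 when the ground truth has a single cluster.

```r
# Hypothetical helper (not part of clevr): homogeneity from first principles,
# assuming the Rosenberg and Hirschberg definition 1 - H(true | pred) / H(true).
homogeneity_sketch <- function(true, pred) {
  stopifnot(length(true) == length(pred), length(true) > 0)

  # Joint distribution over (true cluster, predicted cluster) pairs
  joint <- table(true, pred) / length(true)
  p_true <- rowSums(joint)  # marginal distribution of the true clustering
  p_pred <- colSums(joint)  # marginal distribution of the predicted clustering

  # Entropy of the true clustering, H(true)
  nz_true <- p_true > 0
  h_true <- -sum(p_true[nz_true] * log(p_true[nz_true]))

  # Assumed convention: a single true cluster is treated as perfectly homogeneous
  if (h_true == 0) return(1)

  # Conditional entropy H(true | pred) = -sum P(t, p) * log(P(t, p) / P(p))
  cond <- sweep(joint, 2, p_pred, "/")
  nz <- joint > 0
  h_cond <- -sum(joint[nz] * log(cond[nz]))

  1 - h_cond / h_true
}

true <- c(1, 1, 1, 2, 2)
pred <- c(1, 1, 2, 2, 2)
homogeneity_sketch(true, pred)  # roughly 0.43
```

Because the result is a ratio of entropies, the base of the logarithm does not affect it. Splitting every element into its own predicted cluster gives a homogeneity of 1 under this definition, which is why homogeneity is usually read together with completeness.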
References
Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), (2007).
See Also
completeness evaluates the completeness, which is a dual measure to homogeneity.
v_measure evaluates the harmonic mean of completeness and homogeneity.
Examples
true <- c(1,1,1,2,2) # ground truth clustering
pred <- c(1,1,2,2,2) # predicted clustering
homogeneity(true, pred)