cluster_performance {PPCI} | R Documentation |
External Cluster Validity Metrics
Description
Computes four popular external cluster validity metrics (adjusted Rand index, purity, V-measure and normalised mutual information) by comparing cluster assignments with true class labels.
Usage
cluster_performance(assigned, labels, beta)
Arguments
assigned |
a vector of cluster assignments made by a clustering algorithm. |
labels |
a vector of true class labels to be compared with assigned. |
beta |
(optional) positive numeric, used in the computation of V-measure. Larger values weight homogeneity more heavily than completeness. If omitted, beta = 1 (equal weight applied to both measures). |
Value
a vector containing the four evaluation metrics listed in the description.
References
Zhao Y., Karypis G. (2004) Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering. Machine Learning, 55(3), 311–331.
Strehl A., Ghosh J. (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research, 3, 583–617.
Rosenberg A., Hirschberg J. (2007) V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure. EMNLP-CoNLL, 7, 410–420.
Hubert L., Arabie P. (1985) Comparing Partitions. Journal of Classification, 2(1), 193–218.
Examples
## load dermatology dataset
data(dermatology)
## obtain clustering solution using MCDC
sol <- mcdc(dermatology$x, 6)
## evaluate solution using external cluster validity measures
cluster_performance(sol$cluster, dermatology$c)
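The following is a minimal sketch on toy data, not taken from the package sources. It computes purity by hand, using the common definition (the fraction of points that belong to the majority class of their cluster), so that one of the returned metrics can be sanity-checked. The vectors and the helper purity_by_hand are hypothetical and exist only for illustration; the internal computation in PPCI may differ in detail. The final calls run only if PPCI is installed, and show how the optional beta argument is passed.

```r
## toy cluster assignments and class labels (hypothetical data)
assigned <- c(1, 1, 1, 2, 2, 2)
labels   <- c(1, 1, 2, 2, 2, 2)

## purity_by_hand is a hypothetical helper, not part of PPCI:
## fraction of points falling in the majority class of their cluster
purity_by_hand <- function(assigned, labels) {
  tab <- table(assigned, labels)   # rows = clusters, columns = classes
  sum(apply(tab, 1, max)) / length(labels)
}
toy_purity <- purity_by_hand(assigned, labels)   # (2 + 3) / 6 = 5/6

## if PPCI is installed, evaluate the same toy partition, first with
## the default beta and then with beta = 2 for the V-measure
if (requireNamespace("PPCI", quietly = TRUE)) {
  print(PPCI::cluster_performance(assigned, labels))
  print(PPCI::cluster_performance(assigned, labels, beta = 2))
}
```

Only the V-measure depends on beta, so the other three metrics should be identical across the two calls.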