FzzyCVIs {UniversalCVI} | R Documentation |
Fuzzy cluster validity indexes used in Wiroonsri and Preedasawakul (2023)
Description
Computes the cluster validity indexes used in Wiroonsri and Preedasawakul (2023) for a result of either FCM or EM clustering, for each number of clusters from a user-specified cmin to cmax. The indexes are the XB (X. L. Xie and G. Beni, 1991) index, KWON (S. H. Kwon, 1998) index, KWON2 (S. H. Kwon et al., 2021) index, TANG (Y. Tang et al., 2005) index, HF (F. Haouas et al., 2017) index, WL (C. H. Wu et al., 2015) index, PBM (M. K. Pakhira et al., 2004) index, KPBM (C. Alok, 2010) index, CCVP and CCVS (M. Popescu et al., 2013) indexes, GC1, GC2, GC3, and GC4 (J. C. Bezdek et al., 2016) indexes, and WPC, WP, WPCI1, and WPCI2 (N. Wiroonsri and O. Preedasawakul, 2023) indexes.
Usage
FzzyCVIs(x, cmax, cmin = 2, indexlist = 'all', corr = 'pearson',
method = 'FCM', fzm = 2, gamma = (fzm^2*7)/4, sampling = 1,
iter = 100, nstart = 20, NCstart = TRUE)
Arguments
x |
a numeric data frame or matrix where each column is a variable to be used for cluster analysis and each row is a data point. |
cmax |
a maximum number of clusters to be considered. |
cmin |
a minimum number of clusters to be considered. The default is 2. |
indexlist |
a character string indicating which cluster validity indexes to be computed ( |
corr |
a character string indicating which correlation coefficient is to be computed ( |
method |
a character string indicating which clustering method to be used ( |
fzm |
a number greater than 1 giving the degree of fuzzification for |
gamma |
adjusted fuzziness parameter for |
sampling |
a number greater than 0 and less than or equal to 1 indicating the undersampling proportion of data to be used. This argument is intended for handling a large dataset. The default is 1. |
iter |
a maximum number of iterations for |
nstart |
a maximum number of initial random sets for FCM for |
NCstart |
logical for |
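The requirements on x and the meaning of sampling can be illustrated with base R alone. The sketch below is an illustration under stated assumptions, not the package's internal code: it checks that the input is numeric, standardizes it with scale() as the Examples section does, and draws a random subset of rows whose size is a given proportion of the data. The names `prop`, `keep`, and `x_sub` are hypothetical, and how FzzyCVIs itself draws its sample is not stated on this page.

```r
# Illustrative only: base R, no UniversalCVI calls.
x <- iris[, 1:4]                      # rows = data points, columns = variables

# Every column must be numeric for cluster analysis.
stopifnot(all(vapply(x, is.numeric, logical(1))))

x_scaled <- scale(x)                  # center and scale each column

# Hypothetical undersampling proportion in (0, 1]; 1 keeps all rows.
prop <- 0.5
stopifnot(prop > 0, prop <= 1)

set.seed(1)                           # reproducible draw
n_keep <- ceiling(prop * nrow(x_scaled))
keep   <- sample(nrow(x_scaled), n_keep)   # rows drawn without replacement
x_sub  <- x_scaled[keep, , drop = FALSE]

dim(x_sub)                            # number of rows kept, number of variables
```

In practice the proportion would be passed through the sampling argument rather than applied by hand; the manual version only shows what a proportion of 0.5 means for a 150-row data set.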
Details
The function computes well-known cluster validity indexes for a result of either FCM or EM clustering. These are the XB (X. L. Xie and G. Beni, 1991) index, KWON (S. H. Kwon, 1998) index, KWON2 (S. H. Kwon et al., 2021) index, TANG (Y. Tang et al., 2005) index, HF (F. Haouas et al., 2017) index, WL (C. H. Wu et al., 2015) index, PBM (M. K. Pakhira et al., 2004) index, KPBM (C. Alok, 2010) index, CCVP and CCVS (M. Popescu et al., 2013) indexes, GC1, GC2, GC3, and GC4 (J. C. Bezdek et al., 2016) indexes, and WPC, WP, WPCI1, and WPCI2 (N. Wiroonsri and O. Preedasawakul, 2023) indexes.
The WPC computes the correlation between the actual distance between a pair of data points and the distance between adjusted centroids with respect to the pair. WPCI1 and WPCI2 are the proportion and the subtraction, respectively, of the same two ratios. The first ratio is the WPC improvement from c-1 clusters to c clusters over the entire room for improvement. The second ratio is the WPC improvement from c clusters to c+1 clusters over the entire room for improvement. WP is defined as a combination of WPCI1 and WPCI2.
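The two ratios described above can be sketched numerically. This is a hedged illustration, assuming that "room for improvement" means the gap between the starting WPC value and the maximum correlation of 1. The exact definitions, including boundary cases and the way WP combines WPCI1 and WPCI2, are given in Wiroonsri and Preedasawakul (2023) and are not reproduced here. The vector `wpc` holds made-up values and `improvement_ratio` is a hypothetical helper, not a package function.

```r
# Made-up WPC values for c = 2, ..., 6 (illustration only, not real output).
wpc <- c(`2` = 0.50, `3` = 0.80, `4` = 0.82, `5` = 0.83, `6` = 0.835)

# Hypothetical helper: improvement from `from` to `to`, relative to the
# room left for improvement (assumed here to be 1 - from).
improvement_ratio <- function(from, to) (to - from) / (1 - from)

# Evaluate both ratios at c = 3.
ratio_in  <- improvement_ratio(wpc[["2"]], wpc[["3"]])  # c-1 -> c
ratio_out <- improvement_ratio(wpc[["3"]], wpc[["4"]])  # c -> c+1

ratio_in    # (0.80 - 0.50) / (1 - 0.50)
ratio_out   # (0.82 - 0.80) / (1 - 0.80)

# "Proportion" and "subtraction" of the two ratios, as the text describes.
ratio_in / ratio_out
ratio_in - ratio_out
```

Under this reading, a large first ratio together with a small second ratio means that most of the attainable gain in WPC is reached at c clusters and little is added by moving to c+1.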
Value
WPC |
the WP correlation from |
Each of the following is a data frame giving the values of the index for c from cmin to cmax.
WP |
the WP index. |
WPCI1 |
the WPCI1 index. |
WPCI2 |
the WPCI2 index. |
XB |
the XB index. |
KWON |
the KWON index. |
KWON2 |
the KWON2 index. |
TANG |
the TANG index. |
HF |
the HF index. |
WL |
the WL index. |
PBM |
the PBM index. |
KPBM |
the KPBM index. |
CCVP |
the Pearson Correlation Cluster Validity index. |
CCVS |
the Spearman’s (rho) Correlation Cluster Validity index. |
GC1 |
the generalized C index ( |
GC2 |
the generalized C index ( |
GC3 |
the generalized C index ( |
GC4 |
the generalized C index ( |
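Since each index is returned as a data frame over c, a small helper can read off the number of clusters at which an index is extreme. The sketch below assumes, without confirmation from this page, that the first column of each data frame holds c and the second holds the index value, and that the list elements are named as in the Value table; `best_c` is a hypothetical helper, not part of UniversalCVI. Inspecting the result with str() first is the safer way to confirm the layout.

```r
# Hypothetical helper (not part of UniversalCVI): given a data frame whose
# first column is c and second column is the index value, return the c at
# which the index is smallest (minimize = TRUE) or largest (minimize = FALSE).
best_c <- function(df, minimize = FALSE) {
  pick <- if (minimize) which.min(df[[2]]) else which.max(df[[2]])
  df[[1]][pick]
}

# Run only when the package is installed.
if (requireNamespace("UniversalCVI", quietly = TRUE)) {
  res <- UniversalCVI::FzzyCVIs(scale(iris[, 1:4]), cmax = 10, cmin = 2,
                                indexlist = c("WP", "XB"), method = "FCM")
  str(res)                          # check the actual layout before relying on it
  best_c(res$XB, minimize = TRUE)   # XB: smaller values indicate a better partition
}
```

Whether a given index should be minimized or maximized depends on the index, so the `minimize` flag has to be set from the corresponding reference rather than guessed.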
Author(s)
Nathakhun Wiroonsri and Onthada Preedasawakul
References
C. Alok, "An investigation of clustering algorithms and soft computing approaches for pattern recognition," Department of Computer Science, Assam University, 2010.
J. C. Bezdek, M. Moshtaghi, T. Runkler, C. Leckie, "The generalized c index for internal fuzzy cluster validity," IEEE Transactions on Fuzzy Systems, vol. 24, no. 6, pp. 1500–1512, 2016.
F. Haouas, Z. Ben Dhiaf, A. Hammouda, B. Solaiman, "A new efficient fuzzy cluster validity index: Application to images clustering," 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Naples, Italy, pp. 1–6, 2017.
S. H. Kwon, "Cluster validity index for fuzzy clustering," Electronics Letters, vol. 34, no. 22, pp. 2176–2177, 1998.
S. H. Kwon, J. Kim, S. H. Son, "Improved cluster validity index for fuzzy clustering," Electronics Letters, vol. 57, no. 21, pp. 792–794, 2021.
M. K. Pakhira, S. Bandyopadhyay, U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
M. Popescu, J. C. Bezdek, T. C. Havens, J. M. Keller, "A cluster validity framework based on induced partition dissimilarity," IEEE Transactions on Cybernetics, vol. 43, no. 1, pp. 308–320, 2013.
Y. Tang, F. Sun, Z. Sun, "Improved validation index for fuzzy clustering," in Proceedings of the 2005 American Control Conference, vol. 2, pp. 1120–1125, 2005.
N. Wiroonsri, O. Preedasawakul, "A correlation-based fuzzy cluster validity index with secondary options detector," arXiv:2308.14785, 2023.
C. H. Wu, C. S. Ouyang, L. W. Chen, L. W. Lu, "A new fuzzy clustering validity index with a median factor for centroid-based clustering," IEEE Transactions on Fuzzy Systems, vol. 23, no. 3, pp. 701–718, 2015.
X. Xie, G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
See Also
WP.IDX, GC.IDX, CCV.IDX, R1_data
Examples
library(UniversalCVI)
# Iris data
x = iris[,1:4]
# ---- FCM algorithm ----
# Compute a selected set of indexes ("WPC", "WP", "XB") using the default gamma
F.s = FzzyCVIs(scale(x), cmax = 10, cmin = 2, indexlist = c("WPC","WP","XB"),
               corr = 'pearson', method = 'FCM', fzm = 2, iter = 100, nstart = 20, NCstart = TRUE)
# Plot the computed indexes
plot_idx(F.s)
# ---- EM algorithm ----
# Compute all the indexes with FzzyCVIs using the default gamma
E.all = FzzyCVIs(scale(x), cmax = 10, cmin = 2, indexlist = 'all', corr = 'pearson',
                 method = 'EM', iter = 100, nstart = 20, NCstart = TRUE)
# Plot the computed indexes
plot_idx(E.all)