FzzyCVIs {UniversalCVI}    R Documentation

Fuzzy cluster validity indexes used in Wiroonsri and Preedasawakul (2023)

Description

Computes the cluster validity indexes used in Wiroonsri and Preedasawakul (2023) for the result of either FCM or EM clustering, for each number of clusters from a user-specified cmin to cmax. The available indexes are the XB (X. L. Xie and G. Beni, 1991) index, KWON (S. H. Kwon, 1998) index, KWON2 (S. H. Kwon et al., 2021) index, TANG (Y. Tang et al., 2005) index, HF (F. Haouas et al., 2017) index, WL (C. H. Wu et al., 2015) index, PBM (M. K. Pakhira et al., 2004) index, KPBM (C. Alok, 2010) index, CCVP and CCVS (M. Popescu et al., 2013) indexes, GC1, GC2, GC3, and GC4 (J. C. Bezdek et al., 2016) indexes, and WPC, WP, WPCI1, and WPCI2 (N. Wiroonsri and O. Preedasawakul, 2023) indexes.

Usage

FzzyCVIs(x, cmax, cmin = 2, indexlist = 'all', corr = 'pearson',
  method = 'FCM', fzm = 2, gamma = (fzm^2*7)/4, sampling = 1,
  iter = 100, nstart = 20, NCstart = TRUE)

Arguments

x

a numeric data frame or matrix where each column is a variable to be used for cluster analysis and each row is a data point.

cmax

a maximum number of clusters to be considered.

cmin

a minimum number of clusters to be considered. The default is 2.

indexlist

a character string indicating which cluster validity indexes are to be computed ("all", "WPC", "WP", "WPCI1", "WPCI2", "XB", "KWON", "KWON2", "TANG", "HF", "WL", "PBM", "KPBM", "CCVP", "CCVS", "GC1", "GC2", "GC3", "GC4"). More than one index can be selected.

corr

a character string indicating which correlation coefficient is to be computed ("pearson", "kendall" or "spearman") for indexlist = ("WP", "WPC", "WPCI1", "WPCI2", "CCVP", "CCVS", "GC1", "GC2", "GC3" or "GC4"). The default is "pearson".

method

a character string indicating which clustering method is to be used ("FCM" or "EM"). The default is "FCM".

fzm

a number greater than 1 giving the degree of fuzzification for method = "FCM". The default is 2.

gamma

the adjusted fuzziness parameter for indexlist = ("WP", "WPC", "WPCI1", "WPCI2"). The default is (7 * fzm^2) / 4.

sampling

a number greater than 0 and less than or equal to 1 indicating the undersampling proportion of data to be used. This argument is intended for handling a large dataset. The default is 1.

iter

a maximum number of iterations for method = "FCM". The default is 100.

nstart

a maximum number of initial random sets for method = "FCM". The default is 20.

NCstart

a logical value used when indexlist includes any of "WP", "WPC", "WPCI1", or "WPCI2". If TRUE, the WP correlation at c = 1 is defined as the ratio introduced in the reference. Otherwise, it is set to 0. The default is TRUE.
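Two of the arguments above, sampling and fzm, can be illustrated with a short base R sketch. It is not the package's internal code: the helper fcm_membership and the choice of initial centers are hypothetical, and the way FzzyCVIs draws its subsample is assumed here to be a simple random draw of rows. The membership formula is the standard FCM update for fixed centers.

```r
# Base R sketch; names below are illustrative and not part of UniversalCVI.
set.seed(1)
x <- scale(iris[, 1:4])

# sampling: keep a random proportion of the rows (assumed: simple random draw)
sampling <- 0.5
keep <- sample(nrow(x), size = ceiling(sampling * nrow(x)))
xs <- x[keep, , drop = FALSE]

# fzm: standard FCM membership update for fixed centers,
# u_ij = d_ij^(-2/(fzm-1)) / sum_k d_ik^(-2/(fzm-1))
fcm_membership <- function(x, centers, fzm = 2) {
  stopifnot(fzm > 1)
  # squared Euclidean distance from every point to every center (n x c matrix)
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(x, 2, centers[j, ])^2)
  })
  d2 <- pmax(d2, .Machine$double.eps)  # guard against division by zero
  w <- d2^(-1 / (fzm - 1))
  w / rowSums(w)
}

# Arbitrary centers, chosen only to have something to compute with
centers <- xs[1:3, , drop = FALSE]
U2 <- fcm_membership(xs, centers, fzm = 2)
U5 <- fcm_membership(xs, centers, fzm = 5)

# Larger fzm flattens the memberships toward 1/c
mean(apply(U2, 1, max))
mean(apply(U5, 1, max))
```

Each row of U2 and U5 sums to 1, and the average largest membership is smaller for fzm = 5 than for fzm = 2, which is what "degree of fuzzification" refers to.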

Details

The function computes well-known cluster validity indexes for either FCM or EM clustering. These include the XB (X. L. Xie and G. Beni, 1991) index, KWON (S. H. Kwon, 1998) index, KWON2 (S. H. Kwon et al., 2021) index, TANG (Y. Tang et al., 2005) index, HF (F. Haouas et al., 2017) index, WL (C. H. Wu et al., 2015) index, PBM (M. K. Pakhira et al., 2004) index, KPBM (C. Alok, 2010) index, CCVP and CCVS (M. Popescu et al., 2013) indexes, GC1, GC2, GC3, and GC4 (J. C. Bezdek et al., 2016) indexes, and WPC, WP, WPCI1, and WPCI2 (N. Wiroonsri and O. Preedasawakul, 2023) indexes.

WPC is the correlation between the actual distance between a pair of data points and the distance between the adjusted centroids with respect to that pair. WPCI1 and WPCI2 are the quotient and the difference, respectively, of the same two ratios. The first ratio is the WPC improvement from c-1 clusters to c clusters over the entire room for improvement. The second ratio is the WPC improvement from c clusters to c+1 clusters over the entire room for improvement. WP is defined as a combination of WPCI1 and WPCI2.
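The idea behind these quantities can be sketched in base R under strong simplifying assumptions. The sketch below is not the WPC itself: it uses hard k-means clusters and plain centroids in place of fuzzy memberships and adjusted centroids, sets the correlation at c = 1 to 0 (the NCstart = FALSE convention), and reads "room for improvement" as 1 minus the starting correlation. The helper names pair_centroid_cor and improvement_ratio are hypothetical, and the edge cases handled in the reference are ignored.

```r
# Hard-clustering analogue in base R; not the package's WPC computation.
set.seed(1)
x <- scale(iris[, 1:4])

# Correlation between point-to-point distances and the distances
# between the centroids of the clusters the two points belong to
pair_centroid_cor <- function(x, k, corr = "pearson") {
  if (k == 1) {
    return(0)  # a single centroid gives constant distances; use 0 by convention
  }
  km <- kmeans(x, centers = k, nstart = 20, iter.max = 50)
  d_points <- dist(x)
  d_centers <- dist(km$centers[km$cluster, , drop = FALSE])
  cor(as.vector(d_points), as.vector(d_centers), method = corr)
}

ks <- 1:6
r <- sapply(ks, function(k) pair_centroid_cor(x, k))
names(r) <- ks

# Improvement from one cluster count to the next, relative to the
# remaining room for improvement (assumed here to be 1 - r_from)
improvement_ratio <- function(r_from, r_to) (r_to - r_from) / (1 - r_from)

ratio_in  <- improvement_ratio(r[["2"]], r[["3"]])  # from c-1 = 2 to c = 3
ratio_out <- improvement_ratio(r[["3"]], r[["4"]])  # from c = 3 to c+1 = 4

# Naive quotient and difference of the two ratios at c = 3
c(quotient = ratio_in / ratio_out, difference = ratio_in - ratio_out)
```

A large improvement going into c followed by a small improvement going out of c suggests that c is a good number of clusters, which is the pattern the two ratios are designed to detect.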

Value

WPC

the WP correlation for c from cmin-1 to cmax+1, shown in a data frame.

Each of the following shows the values of the corresponding index for c from cmin to cmax in a data frame.

WP

the WP index.

WPCI1

the WPCI1 index.

WPCI2

the WPCI2 index.

XB

the XB index.

KWON

the KWON index.

KWON2

the KWON2 index.

TANG

the TANG index.

HF

the HF index.

WL

the WL index.

PBM

the PBM index.

KPBM

the KPBM index.

CCVP

the Pearson Correlation Cluster Validity index.

CCVS

the Spearman’s (rho) Correlation Cluster Validity index.

GC1

the generalized C index (∑⋅ ∼ Sum-Product).

GC2

the generalized C index (∑∧ ∼ Sum-Min).

GC3

the generalized C index (∨⋅ ∼ Max-Product).

GC4

the generalized C index (∨∧ ∼ Max-Min).

Author(s)

Nathakhun Wiroonsri and Onthada Preedasawakul

References

C. Alok, "An investigation of clustering algorithms and soft computing approaches for pattern recognition," Department of Computer Science, Assam University, 2010.

J. C. Bezdek, M. Moshtaghi, T. Runkler, C. Leckie, “The generalized c index for internal fuzzy cluster validity,” IEEE Transactions on Fuzzy Systems, vol. 24, no. 6, pp. 1500–1512, 2016.

F. Haouas, Z. Ben Dhiaf, A. Hammouda, B. Solaiman, "A new efficient fuzzy cluster validity index: Application to images clustering," 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Naples, Italy, 2017, pp. 1-6.

S. H. Kwon, “Cluster validity index for fuzzy clustering,” Electronics Letters, vol. 34, no. 22, pp. 2176–2177, 1998.

S. H. Kwon, J. Kim, S. H. Son, “Improved cluster validity index for fuzzy clustering,” Electronics Letters, vol. 57, no. 21, pp. 792–794, 2021.

M. K. Pakhira, S. Bandyopadhyay, U. Maulik, “Validity index for crisp and fuzzy clusters,” Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.

M. Popescu, J. C. Bezdek, T. C. Havens, J. M. Keller, "A Cluster Validity Framework Based on Induced Partition Dissimilarity," in IEEE Transactions on Cybernetics, vol. 43, no. 1, pp. 308-320, Feb. 2013.

Y. Tang, F. Sun, Z. Sun, “Improved validation index for fuzzy clustering,” in Proceedings of the 2005 American Control Conference, vol. 2, pp. 1120–1125, 2005.

N. Wiroonsri, O. Preedasawakul, "A correlation-based fuzzy cluster validity index with secondary options detector," arXiv:2308.14785, 2023.

C. H. Wu, C. S. Ouyang, L. W. Chen, L. W. Lu, “A new fuzzy clustering validity index with a median factor for centroid-based clustering,” IEEE Transactions on Fuzzy Systems, vol. 23, no. 3, pp. 701–718, 2015.

X. Xie, G. Beni, “A validity measure for fuzzy clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.

See Also

WP.IDX, GC.IDX, CCV.IDX, R1_data

Examples


library(UniversalCVI)

# Iris data
x = iris[,1:4]

# ---- FCM algorithm ----


# Compute a selected set of indexes ("WPC", "WP", "XB") using the default gamma
F.s = FzzyCVIs(scale(x), cmax = 10, cmin = 2, indexlist = c("WPC","WP","XB"),
  corr = 'pearson', method = 'FCM', fzm = 2, iter = 100, nstart = 20, NCstart = TRUE)

# Plot the computed indexes
plot_idx(F.s)

# ---- EM algorithm ----

# Compute all the indexes with FzzyCVIs using the default gamma
E.all = FzzyCVIs(scale(x), cmax = 10, cmin = 2, indexlist = 'all', corr = 'pearson',
  method = 'EM', iter = 100, nstart = 20, NCstart = TRUE)

# Plot the computed indexes
plot_idx(E.all)


[Package UniversalCVI version 1.1.2 Index]