FKM.gkb.ent {fclust} | R Documentation |
Gustafson, Kessel and Babuska - like fuzzy k-means with entropy regularization
Description
Performs the Gustafson, Kessel and Babuska - like fuzzy k-means clustering algorithm with entropy regularization.
Unlike standard fuzzy k-means, it is able to discover non-spherical clusters.
The Babuska et al. variant improves the computation of the fuzzy covariance matrices in the standard Gustafson and Kessel clustering algorithm.
The entropy regularization allows us to avoid using the artificial fuzziness parameter m. This is replaced by the degree of fuzzy entropy ent, related to the concept of temperature in statistical physics.
An interesting property of the fuzzy k-means with entropy regularization is that the prototypes are obtained as weighted means with weights equal to the membership degrees (rather than to the membership degrees raised to the power of m, as in fuzzy k-means).
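The weighted-mean property can be illustrated with a minimal base R sketch. It is not the package's implementation: it assumes squared Euclidean distances for simplicity (the algorithm documented here uses a covariance-based distance), and the toy data `X`, the hand-picked prototypes `H`, and the name `H_new` are hypothetical.

```r
# Toy data: two well-separated groups in two dimensions (hypothetical values)
X <- rbind(c(0, 0), c(0.2, 0.1), c(3, 3), c(3.1, 2.9))
H <- rbind(c(0, 0), c(3, 3))   # current prototypes (k = 2), chosen by hand
ent <- 1                       # degree of fuzzy entropy

# Squared Euclidean distances between each object and each prototype
# (assumption: the real algorithm uses a covariance-based distance instead)
D2 <- sapply(seq_len(nrow(H)), function(g) rowSums(sweep(X, 2, H[g, ])^2))

# Membership update: normalized exponentials, with no exponent m involved
U <- exp(-D2 / ent)
U <- U / rowSums(U)

# Prototype update: weighted means with weights equal to the memberships
H_new <- t(U) %*% X / colSums(U)
```

Each row of `U` sums to 1, and each row of `H_new` is the membership-weighted mean of the objects, with the memberships entering linearly.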
Usage
FKM.gkb.ent (X, k, ent, vp, gam, mcn, RS, stand, startU, index, alpha, conv, maxit, seed)
Arguments
X |
Matrix or data.frame |
k |
An integer value or vector specifying the number of clusters for which the |
ent |
Degree of fuzzy entropy (default: 1) |
vp |
Volume parameter (default: rep(1,k)) |
gam |
Weighting parameter for the fuzzy covariance matrices (default: 0) |
mcn |
Maximum condition number for the fuzzy covariance matrices (default: 1e+15) |
RS |
Number of (random) starts (default: 1) |
stand |
Standardization: if |
startU |
Rational start for the membership degree matrix |
index |
Cluster validity index to select the number of clusters: |
alpha |
Weighting coefficient for the fuzzy silhouette index |
conv |
Convergence criterion (default: 1e-9) |
maxit |
Maximum number of iterations (default: 1e+2) |
seed |
Seed value for random number generation (default: NULL) |
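The roles of gam and mcn can be sketched in base R, following the covariance conditioning described by Babuska et al. (2002) as commonly stated: shrink each cluster covariance toward a scaled identity, then bound its condition number by raising the smallest eigenvalues. The helper `condition_cov` and the toy matrices are hypothetical and are not part of fclust; the package's internal code may differ.

```r
# Hypothetical helper, not part of fclust
condition_cov <- function(Fg, F0, gam = 0, mcn = 1e15) {
  p <- nrow(Fg)
  # Shrink toward a scaled identity; the scale comes from the overall covariance F0
  Fg <- (1 - gam) * Fg + gam * det(F0)^(1 / p) * diag(p)
  # Bound the condition number by raising eigenvalues smaller than max / mcn
  e <- eigen(Fg, symmetric = TRUE)
  lam <- pmax(e$values, max(e$values) / mcn)
  e$vectors %*% diag(lam, nrow = p) %*% t(e$vectors)
}

F0 <- diag(c(4, 1))                      # covariance of the whole data set (toy)
Fg <- matrix(c(1, 0, 0, 1e-12), 2, 2)    # nearly singular cluster covariance (toy)

# With gam = 0 only the condition-number bound acts
Fc <- condition_cov(Fg, F0, gam = 0, mcn = 100)
ev <- eigen(Fc, symmetric = TRUE)$values
kappa_after <- max(ev) / min(ev)
```

In this toy case the condition number drops from about 1e12 to at most `mcn`. With the default mcn = 1e+15 the bound only acts on covariance matrices that are numerically close to singular.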
Details
If startU is given, the argument k is ignored (the number of clusters is ncol(startU)).
If startU is given, the first element of value, cput and iter refer to the rational start.
If a cluster covariance matrix becomes singular, the algorithm stops and the element of value is NaN.
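As a rough illustration of what a rational start might look like, the base R sketch below builds a membership degree matrix with one row per object and one column per cluster. It assumes that each row should sum to 1, as is usual for fuzzy membership degrees; the sizes `n` and `k` and the random values are hypothetical.

```r
set.seed(1)
n <- 10   # number of objects (hypothetical)
k <- 3    # number of clusters implied by the start (hypothetical)

# Random non-negative values, normalized so that each row sums to 1
startU <- matrix(runif(n * k), nrow = n, ncol = k)
startU <- startU / rowSums(startU)

# Number of clusters that would be used when this matrix is passed as startU
ncol(startU)
```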
The default value for ent is in general not reasonable if FKM.gkb.ent is run using raw data.
The update of the membership degrees requires the computation of exponential functions. In some cases, this may produce NaN values and the algorithm stops. Such a problem is usually solved by running FKM.gkb.ent using standardized data (stand=1).
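The NaN problem can be reproduced in base R without the package. The sketch below assumes a membership update of the form exp(-d2/ent) normalized across clusters, and the distance values are made up for illustration: when squared distances are large relative to ent, every exponential underflows to zero and the normalization becomes 0/0.

```r
ent <- 1

# Squared distances of one object from two prototypes on a raw (large) scale
d2_raw <- c(1200, 1500)
u_raw <- exp(-d2_raw / ent) / sum(exp(-d2_raw / ent))   # 0 / 0
any(is.nan(u_raw))   # TRUE

# Distances of the size typical for standardized data stay within range
d2_std <- c(1.2, 1.5)
u_std <- exp(-d2_std / ent) / sum(exp(-d2_std / ent))
```

Choosing a larger ent has a similar effect to shrinking the distances, which is why the default ent is rarely suitable for raw data.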
Value
Object of class fclust, which is a list with the following components:
U |
Membership degree matrix |
H |
Prototype matrix |
F |
Array containing the covariance matrices of all the clusters |
clus |
Matrix containing the indexes of the clusters where the objects are assigned (column 1) and the associated membership degrees (column 2) |
medoid |
Vector containing the indexes of the medoid objects ( |
value |
Vector containing the loss function values for the |
criterion |
Vector containing the values of the cluster validity index |
iter |
Vector containing the numbers of iterations for the |
k |
An integer value or vector indicating the number of clusters. (default: 2:6) |
m |
Parameter of fuzziness ( |
ent |
Degree of fuzzy entropy |
b |
Parameter of the polynomial fuzzifier ( |
vp |
Volume parameter (default: |
delta |
Noise distance ( |
gam |
Weighting parameter for the fuzzy covariance matrices |
mcn |
Maximum condition number for the fuzzy covariance matrices |
stand |
Standardization (Yes if |
Xca |
Data used in the clustering algorithm (standardized data if |
X |
Raw data |
D |
Dissimilarity matrix ( |
call |
Matched call |
Author(s)
Paolo Giordani, Maria Brigida Ferraro, Alessio Serafini
References
Babuska R., van der Veen P.J., Kaymak U., 2002. Improved covariance estimation for Gustafson-Kessel clustering. Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1081-1085.
Ferraro M.B., Giordani P., 2013. A new fuzzy clustering algorithm with entropy regularization. Proceedings of the meeting on Classification and Data Analysis (CLADAG).
See Also
FKM.gk.ent, Fclust, Fclust.index, print.fclust, summary.fclust, plot.fclust, unemployment
Examples
## Not run:
## unemployment data
library(fclust)
data(unemployment)
## Gustafson, Kessel and Babuska-like fuzzy k-means with entropy regularization,
## fixing the number of clusters
clust <- FKM.gkb.ent(unemployment, k = 3, ent = 0.2, RS = 10, stand = 1)
## Gustafson, Kessel and Babuska-like fuzzy k-means with entropy regularization,
## selecting the number of clusters
clust <- FKM.gkb.ent(unemployment, k = 2:6, ent = 0.2, RS = 10, stand = 1)
## End(Not run)