DI.IDX {UniversalCVI} | R Documentation |
Dunn index
Description
Computes the Dunn index (DI; J. C. Dunn, 1973) for the result of either k-means or hierarchical clustering, for each number of clusters from a user-specified kmin to kmax.
Usage
DI.IDX(x, kmax, kmin = 2, method = "kmeans", nstart = 100)
Arguments
x |
a numeric data frame or matrix where each column is a variable to be used for cluster analysis and each row is a data point. |
kmax |
the maximum number of clusters to be considered. |
kmin |
the minimum number of clusters to be considered. The default is 2. |
method |
a character string indicating which clustering method to be used ( |
nstart |
a maximum number of initial random sets for kmeans for |
Details
The DI index is defined as
DI(k) = \min_{i \ne j \in [k]}\left\{\frac{\min\left\{d(x_u,x_v)|x_u\in C_i,x_v \in C_j\right\}}{\max_{l \in [k]}\max\left\{d(x_u,x_v)|x_u,x_v \in C_l\right\}}\right\}.
The largest value of DI(k) indicates a valid optimal partition.
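The definition can be checked by hand with a few lines of base R. The sketch below is illustrative only and is not the package's implementation: the helper name `dunn_index`, the use of Euclidean distance for d, and the simulated data `sim` are all assumptions made for this example. Because the denominator (the largest cluster diameter) does not depend on the pair (i, j), the index reduces to the smallest distance between points in different clusters divided by the largest distance between points in the same cluster.

```r
# Hypothetical helper (not part of UniversalCVI): Dunn index of a hard partition.
# Assumes Euclidean distance and at least two clusters.
dunn_index <- function(x, cluster) {
  d <- as.matrix(dist(x))                  # pairwise Euclidean distances
  same <- outer(cluster, cluster, "==")    # TRUE where two points share a cluster
  min_between <- min(d[!same])             # closest pair lying in different clusters
  max_within <- max(d[same])               # largest cluster diameter
  min_between / max_within                 # Inf if every cluster is a single point
}

# Toy check: two clusters of two points each.
# Closest cross-cluster pair is 5 apart; each cluster has diameter 1.
toy <- rbind(c(0, 0), c(0, 1), c(5, 0), c(5, 1))
toy_di <- dunn_index(toy, c(1, 1, 2, 2))

# Scan a range of k with stats::kmeans on simulated (made-up) data.
set.seed(1)
sim <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2))
ks <- 2:5
di_by_k <- data.frame(
  k = ks,
  DI = sapply(ks, function(k) {
    dunn_index(sim, kmeans(sim, centers = k, nstart = 20)$cluster)
  })
)

# Row with the largest index, i.e. the k this criterion would select.
di_by_k[which.max(di_by_k$DI), ]
```

The scan over `ks` mirrors the role of kmin and kmax in the function described here, but the values it produces need not match DI.IDX exactly, since the distance and clustering settings used inside the package may differ from those assumed above.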
Value
DI |
the DI index for |
Author(s)
Nathakhun Wiroonsri and Onthada Preedasawakul
References
J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J Cybern, 3(3), 32-57 (1973).
See Also
Hvalid, Wvalid, DB.IDX, FzzyCVIs, R1_data
Examples
library(UniversalCVI)
# The data is from Wiroonsri (2024).
x = R1_data[,1:2]
# ---- Kmeans ----
# Compute the DI index
K.DI = DI.IDX(scale(x), kmax = 15, kmin = 2, method = "kmeans", nstart = 100)
print(K.DI)
# The optimal number of clusters
K.DI[which.max(K.DI$DI),]
# ---- Hierarchical ----
# Average linkage
# Compute the DI index
H.DI = DI.IDX(scale(x), kmax = 15, kmin = 2, method = "hclust_average")
print(H.DI)
# The optimal number of clusters
H.DI[which.max(H.DI$DI),]