calcDaviesBouldin {geocmeans}    R Documentation

Davies-Bouldin index

Description

Calculate the Davies-Bouldin index of clustering quality.

Usage

calcDaviesBouldin(data, belongmatrix, centers)

Arguments

data

The original dataframe used for the clustering (n*p)

belongmatrix

A membership matrix (n*k)

centers

The centres of the clusters

Details

The Davies-Bouldin index (Da Silva et al. 2020) can be seen as the ratio of the within-cluster dispersion to the between-cluster separation. A lower value indicates more compact clusters, better separated clusters, or both. The formula is:

DB = \frac{1}{k}\sum_{i=1}^k{R_{i}}

with:

R_{i} =\max_{j \neq i}\left(\frac{S_{i}+S_{j}}{M_{i, j}}\right)

S_{i} =\left[\frac{1}{n_{i}} \sum_{l=1}^{n}\left\|\boldsymbol{x}_{l}-\boldsymbol{c}_{i}\right\|*u_{i,l}\right]^{\frac{1}{2}}

M_{i, j} =\sum\left\|\boldsymbol{c}_{i}-\boldsymbol{c}_{j}\right\|

So, the value of the index is the average of the R_{i} values. For each cluster i, R_{i} is its worst (largest) comparison with all the other clusters, calculated as the ratio of the combined within-cluster dispersion of the two clusters (S_{i}+S_{j}) to the separation between them (M_{i, j}).
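
The formulas above can be transcribed almost literally into base R. The sketch below is only an illustration and is not the code used by geocmeans: the helper name db_sketch is hypothetical, n_i is assumed to be the sum of the memberships in cluster i, the weight in S_i is assumed to be the membership of observation l in cluster i, and M_{i, j} is assumed to be the Euclidean distance between the two centres. Its result may therefore differ from the value returned by calcDaviesBouldin.

```r
# Hypothetical helper: a literal reading of the formulas, NOT the package code.
db_sketch <- function(data, belongmatrix, centers) {
  data <- as.matrix(data)
  centers <- as.matrix(centers)
  k <- nrow(centers)

  # S_i: membership-weighted mean distance to centre i, then its square root
  S <- sapply(seq_len(k), function(i) {
    d <- sqrt(rowSums(sweep(data, 2, centers[i, ])^2))  # ||x_l - c_i||
    u <- belongmatrix[, i]
    sqrt(sum(d * u) / sum(u))  # assumption: n_i = sum of memberships
  })

  # M_{i,j}: Euclidean distance between centres (assumption)
  M <- as.matrix(dist(centers))

  # R_i: largest ratio of cluster i against every other cluster
  R <- sapply(seq_len(k), function(i) max((S[i] + S[-i]) / M[i, -i]))

  mean(R)
}

# Toy data: two tight, well separated clusters with hard memberships
toy_data    <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
toy_centers <- rbind(c(0, 1), c(10, 1))
toy_u       <- rbind(c(1, 0), c(1, 0), c(0, 1), c(0, 1))
db_toy <- db_sketch(toy_data, toy_u, toy_centers)
db_toy  # 0.2
```

In the toy case each cluster has S_i = 1 and the centres are 10 units apart, so both R_i equal (1 + 1) / 10 and the index is 0.2. Moving the centres closer together or spreading the points out raises the value.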

Value

A float: the Davies-Bouldin index

References

Da Silva LEB, Melton NM, Wunsch DC (2020). “Incremental cluster validity indices for online learning of hard partitions: Extensions and comparative study.” IEEE Access, 8, 22025–22047.

Examples

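library(geocmeans)

data(LyonIris)
# variables used for the classification
AnalysisFields <- c("Lden", "NO2", "PM25", "VegHautPrt", "Pct0_14", "Pct_65", "Pct_Img",
                    "TxChom1564", "Pct_brevet", "NivVieMed")
dataset <- sf::st_drop_geometry(LyonIris[AnalysisFields])
# queen-contiguity neighbours with row-standardised weights
queen <- spdep::poly2nb(LyonIris, queen = TRUE)
Wqueen <- spdep::nb2listw(queen, style = "W")
# spatial fuzzy c-means classification
result <- SFCMeans(dataset, Wqueen, k = 5, m = 1.5, alpha = 1.5, standardize = TRUE)
# Davies-Bouldin index of the resulting classification
calcDaviesBouldin(result$Data, result$Belongings, result$Centers)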

[Package geocmeans version 0.3.4 Index]