R: Davies–Bouldin (DB) and DB* (DBs) indexes

DB.IDX {UniversalCVI}

R Documentation

Davies–Bouldin (DB) and DB* (DBs) indexes

Description

Computes the DB (D. L. Davies and D. W. Bouldin, 1979) and DBs (M. Kim and R. S. Ramakrishna, 2005) indexes for the result of either k-means or hierarchical clustering, for each number of clusters from a user-specified kmin to kmax.

Usage

DB.IDX(x, kmax, kmin = 2, method = "kmeans",
  indexlist = "all", p = 2, q = 2, nstart = 100)

Arguments

`x`	a numeric data frame or matrix where each column is a variable to be used for cluster analysis and each row is a data point.
`kmax`	the maximum number of clusters to be considered.
`kmin`	the minimum number of clusters to be considered. The default is `2`.
`method`	a character string indicating which clustering method to use (`"kmeans"`, `"hclust_complete"`, `"hclust_average"`, `"hclust_single"`). The default is `"kmeans"`.
`indexlist`	a character string indicating which cluster validity indexes to compute (`"all"`, `"DB"`, `"DBs"`). More than one index can be selected. The default is `"all"`.
`p`	the power of the Minkowski distance between centroids of clusters. The default is `2`.
`q`	the power of dispersion measure of a cluster. The default is `2`.
`nstart`	the number of initial random sets for k-means, used only when `method = "kmeans"`. The default is `100`.

Details

Smaller values are better: the `k` that minimizes DB(k) or DBs(k) indicates the optimal partition.
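
To make the roles of `p` and `q` concrete, the following is a minimal sketch of how the two indexes can be computed for a single partition. It follows the textbook definitions in the cited papers and is not taken from the package source, so it may differ from `DB.IDX` in details. The helper name `db_sketch` and the use of the built-in `iris` data are assumptions made for illustration only.

```r
# db_sketch() is a hypothetical helper, NOT part of UniversalCVI.
# It computes DB and DB* for one partition from the textbook definitions.
db_sketch <- function(x, cluster, p = 2, q = 2) {
  x <- as.matrix(x)
  labels <- sort(unique(cluster))
  k <- length(labels)

  # Centroid of each cluster, one row per cluster
  centers <- do.call(rbind, lapply(labels, function(l) {
    colMeans(x[cluster == l, , drop = FALSE])
  }))

  # Dispersion S_i: q-th root of the mean q-th power of the
  # Euclidean distances from the points of cluster i to its centroid
  S <- sapply(seq_len(k), function(i) {
    pts <- x[cluster == labels[i], , drop = FALSE]
    d <- sqrt(rowSums(sweep(pts, 2, centers[i, ])^2))
    mean(d^q)^(1 / q)
  })

  # Separation M_ij: Minkowski distance of order p between centroids
  M <- as.matrix(dist(centers, method = "minkowski", p = p))
  diag(M) <- NA

  # Pairwise sums S_i + S_j, ignoring the diagonal
  SS <- outer(S, S, "+")
  diag(SS) <- NA

  # DB: average over i of max_j (S_i + S_j) / M_ij
  DB <- mean(apply(SS / M, 1, max, na.rm = TRUE))
  # DB*: average over i of max_j (S_i + S_j) / min_l M_il
  DBs <- mean(apply(SS, 1, max, na.rm = TRUE) /
              apply(M, 1, min, na.rm = TRUE))

  list(DB = DB, DBs = DBs)
}

# Usage on a built-in data set (assumed here only for illustration)
set.seed(1)
x <- scale(iris[, 1:4])
km <- kmeans(x, centers = 3, nstart = 20)
res <- db_sketch(x, km$cluster)
print(res)
```

Under these definitions DBs is never smaller than DB for the same partition, and the two coincide when there are only two clusters.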

Value

`DB`	the DB index for `k` from `kmin` to `kmax` shown in a data frame where the first and the second columns are `k` and the DB index, respectively.
`DBs`	the DBs index for `k` from `kmin` to `kmax` shown in a data frame where the first and the second columns are `k` and the DBs index, respectively.
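
Because each returned data frame holds `k` in its first column and the index in its second, the suggested number of clusters is the `k` in the row with the smallest index value. The sketch below shows that lookup on a stand-in data frame; the numbers and the column names are made up for illustration and are not output of `DB.IDX`.

```r
# Stand-in for a returned data frame: column 1 is k, column 2 is the index.
# The values below are invented placeholders, not real DB.IDX results.
db_table <- data.frame(k = 2:6, DB = c(0.92, 0.61, 0.74, 0.88, 0.95))

# Select by column position so the code does not depend on column names
best_k <- db_table[which.min(db_table[, 2]), 1]
print(best_k)
```

Indexing by position rather than by name keeps the lookup valid whatever the actual column names are.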

Author(s)

Nathakhun Wiroonsri and Onthada Preedasawakul

References

D. L. Davies, D. W. Bouldin, "A cluster separation measure," IEEE Trans Pattern Anal Machine Intell, 1, 224-227 (1979).

M. Kim, R. S. Ramakrishna, "New indices for cluster validity assessment," Pattern Recognition Letters, 26, 2353-2363 (2005).

Examples


library(UniversalCVI)

# The data is from Wiroonsri (2024).
x = R1_data[,1:2]

# ---- Kmeans ----

# Compute all the indices by DB.IDX
K.ALL = DB.IDX(scale(x), kmax = 15, kmin = 2, method = "kmeans",
  indexlist = "all", p = 2, q = 2, nstart = 100)
print(K.ALL)

# Compute DB index
K.DB = DB.IDX(scale(x), kmax = 15, kmin = 2, method = "kmeans",
  indexlist = "DB", p = 2, q = 2, nstart = 100)
print(K.DB)

# ---- Hierarchical ----

# Average linkage

# Compute all the indices by DB.IDX
H.ALL = DB.IDX(scale(x), kmax = 15, kmin = 2, method = "hclust_average",
  indexlist = "all", p = 2, q = 2)
print(H.ALL)

# Compute DB index
H.DB = DB.IDX(scale(x), kmax = 15, kmin = 2, method = "hclust_average",
  indexlist = "DB", p = 2, q = 2)
print(H.DB)

[Package UniversalCVI version 1.1.2 Index]