R: Starczewski and Pakhira-Bandyopadhyay-Maulik for crisp...

STRPBM.IDX {UniversalCVI}

R Documentation

Starczewski and Pakhira-Bandyopadhyay-Maulik for crisp clustering indexes

Description

Computes the STR (A. Starczewski, 2017) and PBM (M. K. Pakhira et al., 2004) indexes for the result of either k-means or hierarchical clustering, for each number of clusters from a user-specified kmin to kmax.

Usage

STRPBM.IDX(x, kmax, kmin = 2, method = "kmeans", indexlist = "all", nstart = 100)

Arguments

`x`	a numeric data frame or matrix where each column is a variable to be used for cluster analysis and each row is a data point.
`kmax`	the maximum number of clusters to be considered.
`kmin`	the minimum number of clusters to be considered. The default is `2`.
`method`	a character string indicating which clustering method is to be used (`"kmeans"`, `"hclust_complete"`, `"hclust_average"`, `"hclust_single"`). The default is `"kmeans"`.
`indexlist`	a character string indicating which cluster validity indexes are to be computed (`"all"`, `"STR"`, `"PBM"`). More than one index can be selected.
`nstart`	a maximum number of initial random sets for kmeans, used only when `method = "kmeans"`. The default is `100`.
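
As a rough illustration of what the `method` values refer to, the sketch below builds a k-means partition and an average-linkage partition with base R only. The mapping of `"kmeans"` to `stats::kmeans` and of `"hclust_average"` to `stats::hclust` followed by `cutree` is an assumption made for illustration, not taken from the package source, and the `iris` data are used only because they ship with R.

```r
# Sketch only: how each `method` value could be reproduced with base R.
# This is an assumed mapping, not the package's actual implementation.
set.seed(1)
x <- scale(iris[, 1:4])   # built-in data, used purely for illustration
k <- 3

# method = "kmeans": k-means with several random starts (cf. `nstart`)
km_labels <- kmeans(x, centers = k, nstart = 100)$cluster

# method = "hclust_average": average-linkage tree cut into k clusters
hc_labels <- cutree(hclust(dist(x), method = "average"), k = k)

# Cluster sizes under each method
table(km_labels)
table(hc_labels)
```

Replacing `"average"` with `"complete"` or `"single"` would give the analogues of the other two hierarchical options under the same assumption.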

Details

The PBM index can be used with both crisp and fuzzy clustering algorithms.
The largest value of STR(k) indicates a valid optimal partition.
The largest value of PBM(k) indicates a valid optimal partition.
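
To make the "largest value" rule concrete, here is a minimal base-R sketch of the PBM index as defined in Pakhira et al. (2004): PBM(K) = ((1/K) * (E_1/E_K) * D_K)^2, where E_1 is the summed Euclidean distance of all points to the grand centroid, E_K is the summed distance of each point to its own cluster centroid, and D_K is the largest distance between two centroids. The function name `pbm_by_hand` is hypothetical, and the package's implementation may differ in details.

```r
# Hand-rolled PBM for one crisp partition (illustrative; not the package code).
pbm_by_hand <- function(x, labels) {
  x <- as.matrix(x)
  ks <- sort(unique(labels))
  K <- length(ks)
  # Cluster centroids, one row per cluster
  centers <- do.call(rbind, lapply(ks, function(g) {
    colMeans(x[labels == g, , drop = FALSE])
  }))
  # E_1: summed distance of all points to the grand centroid
  E1 <- sum(sqrt(rowSums(sweep(x, 2, colMeans(x))^2)))
  # E_K: summed distance of each point to its own cluster centroid
  own <- centers[match(labels, ks), , drop = FALSE]
  EK <- sum(sqrt(rowSums((x - own)^2)))
  # D_K: largest distance between any two centroids
  DK <- max(dist(centers))
  ((1 / K) * (E1 / EK) * DK)^2
}

# Evaluate PBM for k = 2..6 on k-means partitions of a built-in data set
set.seed(1)
x <- scale(iris[, 1:4])
pbm <- sapply(2:6, function(k) {
  pbm_by_hand(x, kmeans(x, centers = k, nstart = 20)$cluster)
})
names(pbm) <- 2:6
pbm
```

Under the rule stated above, the k with the largest entry of `pbm` would be the preferred number of clusters for this illustrative run.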

Value

`STR`	the STR index for `k` from `kmin` to `kmax` shown in a data frame where the first and the second columns are `k` and the STR index, respectively.
`PBM`	the PBM index for `k` from `kmin` to `kmax` shown in a data frame where the first and the second columns are `k` and the PBM index, respectively.
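
Because each element is a two-column data frame, the preferred number of clusters can be read off with `which.max`. The sketch below uses a hypothetical helper `best_k` and a toy data frame with made-up values that only mimics the documented layout (first column `k`, second column the index).

```r
# Hypothetical helper: return the k with the largest index value.
# Assumes the documented layout: column 1 is k, column 2 is the index.
best_k <- function(df) df[[1]][which.max(df[[2]])]

# Made-up numbers, only to show the shape of a returned data frame
toy <- data.frame(k = 2:5, PBM = c(1.2, 3.4, 2.1, 0.9))
best_k(toy)  # 3
```

With a real result, and assuming the return value is a list whose elements are named `STR` and `PBM` as described above, the same helper would be called as `best_k(res$PBM)`, where `res` is the object returned by `STRPBM.IDX`.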

Author(s)

Nathakhun Wiroonsri and Onthada Preedasawakul

References

M. K. Pakhira, S. Bandyopadhyay and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recogn 37(3):487–501 (2004).

A. Starczewski, "A new validity index for crisp clusters," Pattern Anal Applic 20, 687–700 (2017).

Examples


library(UniversalCVI)

# The data is from Wiroonsri (2024).
x = R1_data[,1:2]

# ---- Kmeans ----

# Compute all the indices by STRPBM.IDX
K.ALL = STRPBM.IDX(scale(x), kmax = 15, kmin = 2, method = "kmeans",
  indexlist = "all", nstart = 100)
print(K.ALL)

# Compute STR index
K.STR = STRPBM.IDX(scale(x), kmax = 15, kmin = 2, method = "kmeans",
  indexlist = "STR", nstart = 100)
print(K.STR)

# ---- Hierarchical ----

# Average linkage

# Compute all the indices by STRPBM.IDX
H.ALL = STRPBM.IDX(scale(x), kmax = 15, kmin = 2, method = "hclust_average",
  indexlist = "all")
print(H.ALL)

# Compute STR index
H.STR = STRPBM.IDX(scale(x), kmax = 15, kmin = 2, method = "hclust_average",
  indexlist = "STR")
print(H.STR)

[Package UniversalCVI version 1.1.2 Index]