k.select {bootcluster} | R Documentation |
Estimate number of clusters
Description
Estimate number of clusters by bootstrapping stability
Usage
k.select(x, range = 2:7, B = 20, r = 5, threshold = 0.8, scheme_2 = TRUE)
Arguments
x |
a |
range |
a |
B |
number of bootstrap re-samplings |
r |
number of runs of k-means |
threshold |
the threshold for determining k |
scheme_2 |
logical; if TRUE, Scheme 2 is used, otherwise Scheme 1 |
Details
This function estimates the number of clusters through a bootstrapping approach and a measure, Smin, which is based on an observation-wise similarity among clusterings. The number of clusters k is selected as the largest number of clusters for which Smin is greater than the threshold. The threshold is often chosen between 0.8 and 0.9. Two schemes are provided. Scheme 1 uses the clustering of the original data as the reference for stability calculations. Scheme 2 searches across the clustering samples for the one that gives the most stable clustering.
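The selection rule described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the Smin values are made-up numbers rather than output of k.select, the names k_range, smin_profile, passing and k_hat are hypothetical, and returning NA when no k passes the threshold is an assumption of this sketch, not documented behaviour.

```r
# Candidate numbers of clusters (same form as the 'range' argument)
k_range <- 2:7

# Hypothetical Smin value for each candidate k (illustrative only)
smin_profile <- c(0.95, 0.91, 0.84, 0.62, 0.55, 0.47)

threshold <- 0.8

# Keep the candidates whose Smin exceeds the threshold
passing <- k_range[smin_profile > threshold]

# Select the largest such k; NA if none passes (assumption of this sketch)
k_hat <- if (length(passing) > 0) max(passing) else NA_integer_

print(k_hat)
```

With these made-up values, k = 2, 3 and 4 exceed the threshold, so the sketch selects k = 4. Raising the threshold toward 0.9 makes the rule more conservative and tends to select a smaller k.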
Value
profile |
a vector of Smin measures for determining k |
k |
integer estimated number of clusters |
Author(s)
Han Yu
References
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
Examples
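library(bootcluster)  # load the package that provides k.select
set.seed(1)           # make the bootstrap resampling reproducible
data(wine)
x0 <- wine[,2:14]     # keep columns 2 to 14 of the wine data
x <- scale(x0)        # standardize each column before clustering
k.select(x, range = 2:10, B = 20, r = 5, scheme_2 = TRUE)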