findClusterNumber {SillyPutty} | R Documentation |
Using SillyPutty to find the number of clusters
Description
Finds an approximation of the true number, K, of clusters in a
dataset. The findClusterNumber function calls RandomSillyPutty for
each value of K in the range from start to end, performing N random
starts each time.
NOTE: start must be greater than 1. The function can be slow, depending on the complexity of the dataset and the number of iterations, N.
Usage
findClusterNumber(distobj, start, end, N = 100,
                  method = c("SillyPutty", "HCSP"), ...)
Arguments
distobj |
An object of class |
start |
The minimum cluster number for the range of clusters |
end |
The maximum cluster number for the range of clusters |
N |
Number of iterations |
method |
Whether to use the full |
... |
Extra arguments to the |
Details
The findClusterNumber function processes one distance matrix at a
time, through N iterations. It returns a list of the maximum
silhouette width values obtained from the N iterations, each with
its associated cluster number.
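The scan over K described above can be illustrated with a minimal sketch. This is not the package's implementation: the helper scanClusterNumber is hypothetical, and plain hierarchical clustering (hclust and cutree) stands in for the RandomSillyPutty call, so no random starts are performed. It only shows the general pattern of computing a silhouette width for each K in a range, assuming the cluster package is installed.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper (not part of SillyPutty): for each K in start:end,
# cut a hierarchical clustering into K groups and record the mean
# silhouette width of that partition.
scanClusterNumber <- function(distobj, start, end) {
  stopifnot(inherits(distobj, "dist"), start > 1, end >= start)
  tree <- hclust(distobj, method = "average")
  widths <- sapply(start:end, function(k) {
    labels <- cutree(tree, k = k)
    sil <- silhouette(labels, distobj)
    mean(sil[, "sil_width"])
  })
  names(widths) <- start:end  # label each width with its K
  widths
}

set.seed(1)
# Toy data: three well-separated Gaussian blobs in two dimensions.
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2))
w <- scanClusterNumber(dist(x), start = 2, end = 6)
print(round(w, 2))
```

The sketch requires start > 1 for the same practical reason as the note in the Description: a silhouette width cannot be computed for a single cluster.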
Value
A list containing the maximum silhouette width for each value of K in the range from start to end.
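A common next step is to pick the K with the largest silhouette width. The sketch below assumes that the returned object is named by K, as the use of names(y) in the Examples suggests. The object sw and its values are placeholders made up for illustration, not output from a real run.

```r
# Placeholder values for illustration only; not output from a real run.
sw <- list("3" = 0.41, "4" = 0.56, "5" = 0.48)

# Flatten to a named numeric vector, then take the name of the largest
# entry as the suggested number of clusters.
widths <- unlist(sw)
bestK <- as.integer(names(widths)[which.max(widths)])
```

Because unlist leaves a named numeric vector unchanged, the same two lines apply whether the result is a list or a plain vector.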
Author(s)
Kevin R. Coombes krc@silicovore.com, Dwayne G. Tally dtally110@hotmail.com
References
Pending.
Examples
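# Load the example distance data
data(eucdist)
# Fix the seed for reproducibility
set.seed(12)
y <- findClusterNumber(eucdist, start = 3, end = 7, method = "HCSP")
# Plot the silhouette width against the number of clusters K
plot(names(y), y, xlab = "K", ylab = "Mean Silhouette Width",
     type = "b", lwd = 2, pch = 16)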