gensilwidth {optpart} | R Documentation |
Generalized Silhouette Width
Description
Calculates mean cluster silhouette widths using a generalized mean.
Usage
gensilwidth(clust, dist, p=1)
Arguments
clust |
an integer vector of cluster memberships or a classification object of class ‘clustering’ |
dist |
an object of class ‘dist’ |
p |
the scaling parameter (exponent) of the generalized mean; the default p=1 gives the arithmetic mean |
Details
gensilwidth calculates mean cluster silhouette widths using a generalized
mean. The scaling parameter p can take any value in [-\infty,\infty],
where values less than one emphasize connectivity, and values greater than one
emphasize compactness. Individual sample unit silhouette widths are still calculated as
s_i = (b_i - a_i) / \max(b_i,a_i)
where a_i is the mean dissimilarity of a sample unit to the cluster to which it
is assigned, and b_i is the mean dissimilarity to the nearest neighbor cluster.
Given s_k for all n members of a cluster, the generalized mean is calculated as
\bar s = \left( {1\over n} \sum_{k=1}^n s_k^p \right)^{1/p}
The formula above is undefined for three values of p, which are handled as limiting cases:
for p=0
\bar s = \left( \prod_{k=1}^n s_k \right)^{1/n}
for p=-\infty
\bar s = \min_{k=1}^n s_k
for p=\infty
\bar s = \max_{k=1}^n s_k
Familiar special cases are p=-1 (harmonic mean), p=0 (geometric mean), and p=1 (arithmetic mean).
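As a rough illustration of the formulas in Details, the generalized mean and its limiting cases can be sketched in base R. The helper name gen_mean is hypothetical and is not part of optpart. The sketch assumes strictly positive silhouette widths, because negative values can produce NaN for fractional p or for p=0.

```r
# Hypothetical helper, not the optpart implementation: generalized mean of a
# vector s of silhouette widths with exponent p.
# Assumption: all values in s are strictly positive.
gen_mean <- function(s, p = 1) {
  if (p == -Inf) return(min(s))                 # limiting case: minimum
  if (p == Inf) return(max(s))                  # limiting case: maximum
  if (p == 0) return(prod(s)^(1 / length(s)))   # geometric mean
  mean(s^p)^(1 / p)                             # general case
}

s <- c(0.2, 0.5, 0.8)   # made-up silhouette widths for one cluster
gen_mean(s, 1)      # arithmetic mean: 0.5
gen_mean(s, 0)      # geometric mean
gen_mean(s, -1)     # harmonic mean
gen_mean(s, -Inf)   # minimum: 0.2
gen_mean(s, Inf)    # maximum: 0.8
```

Lower values of p pull the result toward the smallest width in the cluster, and higher values pull it toward the largest.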
Value
an object of class ‘silhouette’, a list with components
cluster |
the assigned cluster for each sample unit |
neighbor |
the identity of the nearest neighbor cluster for each sample unit |
sil_width |
the silhouette width for each sample unit |
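To make the three components concrete, the following base R sketch computes them for a toy data set using the definitions of a_i and b_i given in Details. The helper name sil_sketch is hypothetical and this is not the optpart implementation. It assumes at least two clusters, excludes each unit from its own cluster mean, and assigns width 0 to units in singleton clusters; these conventions are assumptions and are not stated on this page.

```r
# Hypothetical helper, not the optpart implementation: per-unit silhouette
# components from a membership vector and a 'dist' object.
sil_sketch <- function(clust, d) {
  m <- as.matrix(d)
  ids <- sort(unique(clust))
  n <- length(clust)
  neighbor <- integer(n)
  sil_width <- numeric(n)
  for (i in seq_len(n)) {
    # mean dissimilarity from unit i to each cluster, excluding i itself
    means <- sapply(ids, function(k) {
      members <- setdiff(which(clust == k), i)
      if (length(members) == 0) NA_real_ else mean(m[i, members])
    })
    own <- match(clust[i], ids)
    a <- means[own]              # a_i: own-cluster mean dissimilarity
    others <- means
    others[own] <- NA
    nb <- which.min(others)      # which.min() skips NA entries
    b <- others[nb]              # b_i: nearest neighbor cluster
    neighbor[i] <- ids[nb]
    # assumption: a unit alone in its cluster gets width 0
    sil_width[i] <- if (is.na(a)) 0 else (b - a) / max(a, b)
  }
  list(cluster = clust, neighbor = neighbor, sil_width = sil_width)
}

x <- c(0, 1, 10, 11)        # four made-up points on a line
clust <- c(1, 1, 2, 2)      # two well-separated clusters
res <- sil_sketch(clust, dist(x))
res$neighbor                # 2 2 1 1
res$sil_width               # 9.5/10.5, 8.5/9.5, 8.5/9.5, 9.5/10.5
```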
Author(s)
Attila Lengyel and Zoltan Botta-Dukat wrote the algorithm
David W. Roberts droberts@montana.edu http://ecology.msu.montana.edu/labdsv/R
References
Lengyel, A. and Z. Botta-Dukat. 2019. Silhouette width using generalized mean: A flexible method for assessing clustering efficiency. Ecology and Evolution https://doi.org/10.1002/ece3.5774
See Also
Examples
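library(optpart)                     # dsvdis() comes from labdsv, which optpart depends on
data(shoshveg)
dis.bc <- dsvdis(shoshveg, 'bray')   # Bray-Curtis dissimilarity matrix
opt.5 <- optpart(5, dis.bc)          # partition into 5 clusters
gensilwidth(opt.5, dis.bc)           # default p=1 (arithmetic mean)
gensilwidth(opt.5, dis.bc, p=-1)     # harmonic mean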