distcritmulti {fpc} | R Documentation |
Distance based validity criteria for large data sets
Description
Approximates the average silhouette width or the Pearson version of Hubert's gamma criterion by splitting the dataset into subsets and averaging the subset-wise values; see Hennig and Liao (2013).
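The subset-averaging idea can be illustrated with a minimal sketch for the average silhouette width. The helper approx_asw below is hypothetical and not part of fpc; it assumes random, disjoint, roughly equal-sized subsets and a plain unweighted mean, which may differ from how distcritmulti chooses and combines its subsets.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper (not part of fpc): approximate the average
# silhouette width by averaging over random disjoint subsets.
approx_asw <- function(x, clustering, ns = 5, seed = 1) {
  set.seed(seed)
  n <- nrow(x)
  # Assign every case to one of ns roughly equal-sized subsets.
  subset_id <- sample(rep(seq_len(ns), length.out = n))
  crit_sub <- vapply(seq_len(ns), function(i) {
    idx <- which(subset_id == i)
    # Only the distances within the subset are computed.
    d <- dist(x[idx, , drop = FALSE])
    sil <- silhouette(clustering[idx], d)
    mean(sil[, "sil_width"])
  }, numeric(1))
  list(crit.overall = mean(crit_sub),
       crit.sub = crit_sub,
       crit.sd = sd(crit_sub))
}

# Two well-separated Gaussian clusters of 100 cases each.
set.seed(42)
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 6), ncol = 2))
cl <- rep(1:2, each = 100)

res <- approx_asw(x, cl, ns = 5)
# Exact value on the full data, for comparison.
full <- mean(silhouette(cl, dist(x))[, "sil_width"])
```

Each subset needs only its own distance matrix, so memory use grows with the square of the subset size rather than of the full number of cases.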
Usage
distcritmulti(x,clustering,part=NULL,ns=10,criterion="asw",
fun="dist",metric="euclidean",
count=FALSE,seed=NULL,...)
Arguments
x |
cases times variables data matrix. |
clustering |
vector of integers indicating the clustering. |
part |
vector of integer subset sizes; sum should be smaller than or
equal to the number of cases of x. |
ns |
integer. Number of subsets, only used if part is NULL. |
criterion |
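"asw" or "pearsongamma"; selects the average silhouette width or the Pearson version of Hubert's gamma. |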
fun |
name of the function used to compute the dissimilarities (default "dist"). |
metric |
passed on to the dissimilarity function named by fun to select the dissimilarity measure (default "euclidean"). |
count |
logical. If TRUE, the number of the subset currently being processed is printed. |
seed |
integer, random seed. (If NULL, no seed is set and the result depends on the current state of the random number generator.) |
... |
further arguments passed on to the dissimilarity function. |
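For the "pearsongamma" criterion, the quantity being approximated can be sketched in base R as the Pearson correlation between the pairwise distances and a 0-1 vector that is 0 for pairs in the same cluster and 1 for pairs in different clusters. The helper pearson_gamma is a hypothetical name used only for illustration; it is not an fpc function and is not necessarily how the package computes the value.

```r
# Hypothetical helper (not part of fpc): Pearson version of Hubert's
# gamma for a dissimilarity object d and a clustering vector.
pearson_gamma <- function(d, clustering) {
  d <- as.matrix(d)
  # TRUE where the two cases belong to different clusters.
  different <- outer(clustering, clustering, "!=")
  # Use each unordered pair of cases exactly once.
  lt <- lower.tri(d)
  cor(d[lt], as.numeric(different[lt]))
}

set.seed(1)
x <- rbind(matrix(rnorm(40), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))
cl <- rep(1:2, each = 20)

g_good <- pearson_gamma(dist(x), cl)          # labels match the structure
g_rand <- pearson_gamma(dist(x), sample(cl))  # labels shuffled at random
```

Values close to 1 indicate that large distances coincide with pairs from different clusters; shuffled labels should give a clearly lower value.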
Value
A list with components crit.overall, crit.sub, crit.sd, subsets.
crit.overall |
value of criterion. |
crit.sub |
vector of subset-wise criterion values. |
crit.sd |
standard deviation of crit.sub. |
subsets |
list of case indexes in subsets. |
Author(s)
Christian Hennig christian.hennig@unibo.it https://www.unibo.it/sitoweb/christian.hennig/en
References
Halkidi, M., Batistakis, Y., Vazirgiannis, M. (2001) On Clustering Validation Techniques, Journal of Intelligent Information Systems, 17, 107-145.
Hennig, C. and Liao, T. (2013) How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification, Journal of the Royal Statistical Society, Series C (Applied Statistics), 62, 309-369.
Kaufman, L. and Rousseeuw, P.J. (1990) Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York.
See Also
Examples
set.seed(20000)
options(digits=3)
face <- rFace(50,dMoNo=2,dNoEy=0,p=2)
clustering <- as.integer(attr(face,"grouping"))
distcritmulti(face,clustering,ns=3,seed=100000,criterion="pearsongamma")