relabel {mcclust} | R Documentation |
Stephens' Relabelling Algorithm for Clusterings
Description
For a sample of clusterings in which corresponding clusters have different labels, the algorithm attempts to bring the clusterings to a unified labelling.
Usage
relabel(cls, print.loss = TRUE)
Arguments
cls |
a matrix in which every row corresponds to a clustering of the ncol(cls) objects. |
print.loss |
logical, should the current value of the loss function be printed after each iteration? Defaults to TRUE. |
Details
The algorithm minimizes the loss function
\sum_{m=1}^M \sum_{i=1}^n \sum_{j=1}^K -\log\hat{p}_{ij} \cdot I_{\{z_i^{(m)}=j\}}
over the M clusterings, n observations and K clusters, where \hat{p}_{ij} is the estimated probability that observation i belongs to cluster j and z_i^{(m)} indicates to which cluster observation i belongs in clustering m. I_{\{.\}} is an indicator function.
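As a rough illustration of the loss function, the following base R sketch evaluates it for a given label matrix and probability matrix. The helper name relabel_loss and the use of relative label frequencies as the estimate of \hat{p}_{ij} are assumptions of this sketch, not part of the package interface.

```r
# Loss of a sample of clusterings given estimated membership probabilities.
# cls: M x n matrix of labels in 1..K; P: n x K matrix of probabilities.
# relabel_loss is a hypothetical helper, not a function of mcclust.
relabel_loss <- function(cls, P) {
  n <- ncol(cls)
  total <- 0
  for (m in seq_len(nrow(cls))) {
    # cbind(i, z_i) selects P[i, z_i^(m)] for every observation i,
    # which is what the indicator in the formula picks out
    total <- total - sum(log(P[cbind(seq_len(n), cls[m, ])]))
  }
  total
}

cls <- rbind(c(1, 1, 2, 2), c(1, 1, 2, 2), c(1, 2, 2, 2))
# Assumed estimate: relative frequency of label j for observation i
P <- sapply(1:2, function(j) colMeans(cls == j))
relabel_loss(cls, P)
```

Only the probabilities of labels that actually occur enter the sum, so zero entries of P for labels that never occur do not cause infinite values here.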
Minimization is achieved by iterating the estimation of \hat{p}_{ij} over all clusterings and the minimization of the loss function in each clustering by permuting the cluster labels. The latter is done by linear programming.
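The two alternating steps can be sketched in base R as follows. This is a minimal illustration, not the package code: it replaces the linear programming step with a brute-force search over all K! label permutations (only practical for small K), estimates \hat{p}_{ij} by relative frequencies, adds a small constant eps to avoid log(0), and assumes every clustering uses the labels 1..K. The names all_perms and sketch_relabel are hypothetical.

```r
# All permutations of 1..K as rows of a matrix (brute force, small K only).
all_perms <- function(K) {
  if (K == 1) return(matrix(1, 1, 1))
  prev <- all_perms(K - 1)
  out <- NULL
  for (pos in seq_len(K)) {
    # insert label K at position pos of every permutation of 1..(K-1)
    left <- prev[, seq_len(pos - 1), drop = FALSE]
    right <- prev[, seq_len(K - 1) >= pos, drop = FALSE]
    out <- rbind(out, unname(cbind(left, K, right)))
  }
  out
}

sketch_relabel <- function(cls, max_iter = 20, eps = 1e-8) {
  K <- max(cls)
  idx <- seq_len(ncol(cls))
  perms <- all_perms(K)
  est_P <- function(cls) sapply(seq_len(K), function(j) colMeans(cls == j))
  for (iter in seq_len(max_iter)) {
    # Step 1: estimate p_ij over all clusterings
    logP <- log(est_P(cls) + eps)
    changed <- FALSE
    # Step 2: in each clustering, choose the permutation with the smallest loss
    for (m in seq_len(nrow(cls))) {
      losses <- apply(perms, 1, function(p) -sum(logP[cbind(idx, p[cls[m, ]])]))
      new <- perms[which.min(losses), ][cls[m, ]]
      if (any(new != cls[m, ])) {
        cls[m, ] <- new
        changed <- TRUE
      }
    }
    if (!changed) break  # no clustering was relabelled: stop
  }
  list(cls = cls, P = est_P(cls))
}

cls <- rbind(c(1, 1, 2, 2), c(1, 1, 2, 2), c(1, 2, 2, 2), c(2, 2, 1, 1))
res <- sketch_relabel(cls)
res$cls
```

In this small input the fourth clustering is the only one whose labels get swapped. Enumerating permutations grows factorially with K, which is one reason an assignment-type linear program is the more practical choice.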
Value
cls |
the input cls with unified labelling |
P |
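an n x K matrix, where entry [i,j] contains the estimated probability that observation i belongs to cluster j. |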
loss.val |
value of the loss function. |
cl |
vector of cluster memberships that have the highest probabilities |
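To illustrate how a membership vector such as cl relates to the estimated probabilities, the sketch below takes the most probable cluster for each observation. It assumes P is an n x K matrix of the \hat{p}_{ij} from Details; the values are made up for illustration.

```r
# Hypothetical 4 x 2 matrix of estimated membership probabilities
P <- rbind(c(1.00, 0.00),
           c(0.75, 0.25),
           c(0.00, 1.00),
           c(0.00, 1.00))
# For each observation (row), the cluster with the highest probability
cl <- apply(P, 1, which.max)
cl
```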
Warning
The algorithm assumes that the number of clusters K is fixed. If this is not the case, K is taken to be the most common number of clusters. Clusterings with other numbers of clusters are discarded and a warning is issued.
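The rule described above might look roughly like this in base R. How the package counts clusters internally is not stated here, so counting the distinct labels per row is an assumption of this sketch, as are the variable names.

```r
cls <- rbind(c(1, 1, 2, 2), c(1, 2, 2, 3), c(1, 2, 2, 2), c(2, 2, 1, 1))
# Number of distinct labels in each clustering (row)
n.cl <- apply(cls, 1, function(z) length(unique(z)))
# Most common number of clusters across the sample
K <- as.integer(names(which.max(table(n.cl))))
# Keep only clusterings with exactly K clusters
kept <- cls[n.cl == K, , drop = FALSE]
n.discarded <- nrow(cls) - nrow(kept)
c(K = K, discarded = n.discarded)
```

Here the second clustering has three clusters while the others have two, so it would be the one dropped.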
Note
The implementation is a variant of the algorithm of Stephens, which was originally applied to draws of parameters for each observation, not to cluster labels.
Author(s)
Arno Fritsch, arno.fritsch@tu-dortmund.de
References
Stephens, M. (2000) Dealing with label switching in mixture models. Journal of the Royal Statistical Society Series B, 62, 795–809.
See Also
lp.transport for the linear programming, maxpear, minbinder, medv for other possibilities of processing a sample of clusterings.
Examples
(cls <- rbind(c(1,1,2,2),c(1,1,2,2),c(1,2,2,2),c(2,2,1,1)))
# group 2 in clustering 4 corresponds to group 1 in clusterings 1-3.
cls.relab <- relabel(cls)
cls.relab$cls