relabel {mcclust} | R Documentation |
Stephens' Relabelling Algorithm for Clusterings
Description
For a sample of clusterings in which corresponding clusters have different labels, the algorithm attempts to bring the clusterings to a unique labelling.
Usage
relabel(cls, print.loss = TRUE)
Arguments
cls |
a matrix in which every row corresponds to a clustering of the ncol(cls) objects. |
print.loss |
logical, should current value of loss function be printed after each iteration? Defaults to TRUE. |
Details
The algorithm minimizes the loss function

$$\sum_{m=1}^{M} \sum_{i=1}^{n} \sum_{j=1}^{K} -\log(\hat{p}_{ij}) \cdot I_{\{z_i^{(m)} = j\}}$$

over the $M$ clusterings, $n$ observations and $K$ clusters, where $\hat{p}_{ij}$ is the
estimated probability that observation $i$ belongs to cluster $j$ and $z_i^{(m)}$ indicates to which cluster
observation $i$ belongs in clustering $m$. $I_{\{\cdot\}}$ is an indicator function.
Minimization is achieved by iterating the estimation of $\hat{p}_{ij}$ over all clusterings and the
minimization of the loss function in each clustering by permuting the cluster labels. The latter is
done by linear programming.
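The two alternating steps can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names `estimate_P`, `loss`, `all_perms` and `relabel_sketch` are hypothetical, labels are assumed to be integers in 1..K, and the permutation step uses an exhaustive search over all K! permutations in place of linear programming, so it is only practical for small K. The small constant `eps` is also an assumption, added to keep the logarithm finite.

```r
# Estimate P[i, j]: fraction of clusterings in which observation i has label j.
estimate_P <- function(cls, K) {
  P <- sapply(seq_len(K), function(j) colMeans(cls == j))
  matrix(P, nrow = ncol(cls), ncol = K)
}

# Loss: minus the sum over clusterings m and observations i of
# log(P[i, z_i^(m)]); eps (an assumption) avoids log(0).
loss <- function(cls, P, eps = 1e-10) {
  n <- ncol(cls)
  total <- 0
  for (m in seq_len(nrow(cls))) {
    total <- total - sum(log(P[cbind(seq_len(n), cls[m, ])] + eps))
  }
  total
}

# All permutations of 1..K as rows of a matrix (exhaustive, small K only).
all_perms <- function(K) {
  if (K == 1) return(matrix(1L, 1, 1))
  sub <- all_perms(K - 1)
  do.call(rbind, lapply(seq_len(K), function(pos) {
    # insert K at position pos in each permutation of 1..(K - 1)
    t(apply(sub, 1, function(p) append(p, K, after = pos - 1)))
  }))
}

relabel_sketch <- function(cls, max_iter = 20) {
  K <- max(cls)            # assumes labels are 1..K
  perms <- all_perms(K)
  old <- Inf
  for (iter in seq_len(max_iter)) {
    # Step 1: estimate the membership probabilities from all clusterings
    P <- estimate_P(cls, K)
    # Step 2: for each clustering, pick the label permutation with lowest loss
    for (m in seq_len(nrow(cls))) {
      costs <- apply(perms, 1, function(p) loss(rbind(p[cls[m, ]]), P))
      cls[m, ] <- perms[which.min(costs), ][cls[m, ]]
    }
    cur <- loss(cls, estimate_P(cls, K))
    if (cur >= old) break  # stop once the loss no longer decreases
    old <- cur
  }
  list(cls = cls, P = estimate_P(cls, K), loss.val = cur)
}

cls <- rbind(c(1, 1, 2, 2), c(1, 1, 2, 2), c(1, 2, 2, 2), c(2, 2, 1, 1))
res <- relabel_sketch(cls)
res$cls
```

In this small sample the fourth clustering is the only one whose labels are swapped, so the sketch relabels that row and leaves the others unchanged.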
Value
cls |
the input cls with unified labelling. |
P |
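an $n \times K$ matrix (observations by clusters), where entry [i, j] contains the estimated probability that observation i belongs to cluster j. |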
loss.val |
value of the loss function. |
cl |
vector of cluster memberships that have the highest probabilities |
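As an illustration of how cl relates to P, the following sketch assigns each observation to the cluster with the highest estimated probability. It assumes P has one row per observation and one column per cluster. The matrix values are hypothetical, and resolving ties in favour of the first maximum is an assumption of this sketch, not a documented behaviour of the function.

```r
# Hypothetical probability matrix: rows are observations, columns are clusters
P <- rbind(c(1.00, 0.00),
           c(0.75, 0.25),
           c(0.00, 1.00),
           c(0.00, 1.00))

# For each observation, the index of the cluster with the highest probability
cl <- apply(P, 1, which.max)
cl
```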
Warning
The algorithm assumes that the number of clusters $K$ is fixed. If this is not the case,
$K$ is taken to be the most common number of clusters. Clusterings with other numbers of clusters are discarded
and a warning is issued.
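The filtering described here can be sketched in base R as below. This is an illustration only: the function name `drop_unusual_K`, the warning text and the tie-breaking rule (the smallest of the most common values wins) are assumptions, not taken from the package.

```r
# Keep only the clusterings whose number of clusters equals the most common one.
drop_unusual_K <- function(cls) {
  # Number of distinct labels in each clustering (row)
  k_per_row <- apply(cls, 1, function(z) length(unique(z)))
  # Most common number of clusters
  K <- as.integer(names(which.max(table(k_per_row))))
  keep <- k_per_row == K
  if (any(!keep)) {
    warning(sum(!keep), " clustering(s) with a different number of clusters discarded")
  }
  list(cls = cls[keep, , drop = FALSE], K = K)
}

# Hypothetical sample: rows 1, 2 and 4 use two clusters, row 3 uses three
cls <- rbind(c(1, 1, 2, 2),
             c(1, 1, 2, 2),
             c(1, 2, 3, 3),
             c(2, 2, 1, 1))
filtered <- drop_unusual_K(cls)
filtered$K
filtered$cls
```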
Note
The implementation is a variant of Stephens' algorithm, which was originally applied to draws of parameters for each observation rather than to cluster labels.
Author(s)
Arno Fritsch, arno.fritsch@tu-dortmund.de
References
Stephens, M. (2000) Dealing with label switching in mixture models. Journal of the Royal Statistical Society Series B, 62, 795–809.
See Also
lp.transport for the linear programming; maxpear, minbinder, medv for other possibilities of processing a sample of clusterings.
Examples
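# four clusterings of four observations, one clustering per row
(cls <- rbind(c(1,1,2,2),c(1,1,2,2),c(1,2,2,2),c(2,2,1,1)))
# group 2 in clustering 4 corresponds to group 1 in clusterings 1-3.
cls.relab <- relabel(cls)
# the clusterings after relabelling
cls.relab$cls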