label.switching {label.switching}    R Documentation
Main calling function
Description
This is the main function of the package. It is used to reorder a simulated MCMC sample of the parameters of a mixture (or, more generally, a hidden Markov model) according to eight label-switching solving methods: the ECR algorithm (default version), the ECR algorithm (two iterative versions), the PRA algorithm, Stephens' algorithm, the Artificial Identifiability Constraint (AIC), Data-Based relabelling and a probabilistic relabelling algorithm (SJW). The input depends on the type of label-switching method. The output contains a list with the permutations returned by each method, the corresponding single best clusterings and the CPU time demanded by each method. In what follows: m denotes the number of MCMC iterations, n denotes the sample size of the observed data, K denotes the number of mixture components and J the number of different types of parameters of the model.
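For orientation, here is a minimal sketch of the expected input dimensions (the values are hypothetical):

m <- 300  # number of MCMC iterations
n <- 5    # sample size of the observed data
K <- 2    # number of mixture components
J <- 3    # parameter types (e.g. means, variances, weights)
# z:    m x n integer array of simulated allocation vectors
# mcmc: m x K x J array of simulated parameter values
# p:    m x n x K array of allocation probabilities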
Usage
label.switching(method, zpivot, z, K, prapivot, p, complete,
                mcmc, sjwinit, data, constraint,
                groundTruth, thrECR, thrSTE, thrSJW,
                maxECR, maxSTE, maxSJW, userPerm)
Arguments
method: any non-empty subset of c("ECR","ECR-ITERATIVE-1","PRA","ECR-ITERATIVE-2","STEPHENS","SJW","AIC","DATA-BASED") indicating the desired label-switching solving method. Also available is the option "USER-PERM", which corresponds to a user-defined set of permutations.

zpivot: d x n integer array of the d pivot allocation vectors. Demanded by the ECR method.

z: m x n integer array of the latent allocation vectors generated by an MCMC algorithm.

K: the number of mixture components. This is demanded by the ECR, ECR-ITERATIVE-1 and ECR-ITERATIVE-2 methods.

prapivot: K x J array containing the parameter values that will be used as the pivot by the PRA method.

p: m x n x K array of allocation probabilities of the n observations among the K mixture components, for each of the m MCMC iterations. Demanded by the STEPHENS and ECR-ITERATIVE-2 methods.

complete: function that returns the complete log-likelihood of the mixture model. Demanded by the SJW method.

mcmc: m x K x J array of simulated MCMC parameters. Demanded by the SJW and PRA methods.

sjwinit: an (optional) index pointing at the MCMC iteration whose parameters will initialize the SJW algorithm.

data: n-dimensional data vector or array. Demanded by the SJW and DATA-BASED methods.

constraint: an (optional) integer between 1 and J corresponding to the parameter that will be used to apply the Identifiability Constraint. In this case the MCMC output is reordered according to the constraint mcmc[i,1,constraint] < ... < mcmc[i,K,constraint].

groundTruth: optional integer vector of n allocations, which are considered as the ground-truth clustering of the n observations. The output of each method is then relabelled to maximize its similarity with this clustering.

thrECR: an (optional) positive number controlling the convergence criterion for the ECR-ITERATIVE-1 and ECR-ITERATIVE-2 methods.

thrSTE: an (optional) positive number controlling the convergence criterion for the STEPHENS method.

thrSJW: an (optional) positive number controlling the convergence criterion for the SJW method.

maxECR: an (optional) integer controlling the max number of iterations for the ECR-ITERATIVE-1 and ECR-ITERATIVE-2 methods.

maxSTE: an (optional) integer controlling the max number of iterations for the STEPHENS method.

maxSJW: an (optional) integer controlling the max number of iterations for the SJW method.

userPerm: an (optional) list with user-defined permutations. It is required only if "USER-PERM" has been chosen in method; in that case, userPerm[[i]] is an m x K array of permutations for i = 1, ..., S, where S denotes the number of user-defined relabellings (see the sketch after this list).
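For illustration, a user-defined relabelling could be supplied as follows (a minimal sketch; myPerms is a hypothetical m x K array whose i-th row is a permutation of 1:K):

# Hypothetical sketch: relabel according to one user-defined scheme
ls <- label.switching(method = "USER-PERM", z = z, K = K,
                      userPerm = list(myPerms))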
Value
permutations: an m x K array of permutations per method.

clusters: an n-dimensional vector with the single best clustering of the observations per method.

timings: CPU time needed by each relabelling method.

similarity: correlation matrix between the label-switching solving methods in terms of their matching single best-clustering allocations.
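The returned components can be accessed by name; for example, assuming ls holds the result of the label.switching() call shown in the Examples:

ls$permutations$"ECR-1"  # m x K permutations from ECR with the first pivot
ls$clusters              # single best clustering per method
ls$timings               # CPU time per method
ls$similarity            # agreement between the methods' clusterings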
Note
If the ground truth is not given, all methods are reordered using the estimated single best clustering of the first provided method. The methods sjw and pra are not suggested for a large number of components. Also note that sjw might be quite slow even for a small number of components. In this case, try adjusting thrSJW or maxSJW to smaller values than the default ones.
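For instance, a cheaper SJW run could be requested as follows (a minimal sketch with illustrative values only; complete.normal.loglikelihood is defined in the Examples below):

ls <- label.switching(method = "SJW", z = z, K = K, data = x,
                      complete = complete.normal.loglikelihood,
                      mcmc = mcmc.pars, thrSJW = 1e-4, maxSJW = 10)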
Author(s)
Panagiotis Papastamoulis
References
Papastamoulis P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1): 1-24.
See Also
ecr, ecr.iterative.1, ecr.iterative.2, stephens, pra, sjw, dataBased, aic
Examples
# We will apply the following methods:
# ECR, ECR-ITERATIVE-1, PRA, AIC and DATA-BASED.
# The default ECR will use two different pivots.
# Load a toy example: the MCMC output consists of the random beta model
# applied to a normal mixture of K = 2 components. The number of
# observations is equal to n = 5. The number of MCMC samples is
# equal to m = 300. The simulated allocations are stored in the array z.
data("mcmc_output")
mcmc.pars <- data_list$"mcmc.pars"
# The MCMC parameters are stored in the array mcmc.pars:
# mcmc.pars[,,1]: simulated means of the two components
# mcmc.pars[,,2]: simulated variances
# mcmc.pars[,,3]: simulated weights
# We will use two pivots for the default ECR algorithm:
# the first one corresponds to iteration mapindex (complete MAP)
# the second one corresponds to iteration mapindex.non (observed MAP)
z <- data_list$"z"
K <- data_list$"K"
x <- data_list$"x"
mapindex <- data_list$"mapindex"
mapindex.non <- data_list$"mapindex.non"
# The PRA method will use as pivot the iteration that corresponds to
# the complete MAP estimate (mapindex).
# Apply (a subset of the available) methods by typing:
ls <- label.switching(method = c("ECR", "ECR-ITERATIVE-1", "PRA", "AIC", "DATA-BASED"),
                      zpivot = z[c(mapindex, mapindex.non), ], z = z, K = K, data = x,
                      prapivot = mcmc.pars[mapindex, , ], mcmc = mcmc.pars)
# Plot the raw and reordered means of the K = 2 normal mixture components for each method
par(mfrow = c(2, 4))
# Raw MCMC output for the means (with label switching)
matplot(mcmc.pars[,,1],type="l",
xlab="iteration",main="Raw MCMC output",ylab = "means")
# Reordered outputs
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-1")$output[,,1],type="l",
xlab="iteration",main="ECR (1st pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-2")$output[,,1],type="l",
xlab="iteration",main="ECR (2nd pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-ITERATIVE-1")$output[,,1],
type="l",xlab="iteration",main="ECR-iterative-1",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"PRA")$output[,,1],type="l",
xlab="iteration",main="PRA",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"AIC")$output[,,1],type="l",
xlab="iteration",main="AIC",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"DATA-BASED")$output[,,1],type="l",
xlab="iteration",main="DATA-BASED",ylab = "means")
#######################################################
# If the user wants to apply the STEPHENS and SJW algorithms as well:
# The STEPHENS method requires the classification probabilities
p <- data_list$"p"
# The SJW method requires the complete log-likelihood of the model.
# For the univariate normal mixture, this is defined as follows:
complete.normal.loglikelihood <- function(x, z, pars){
  # x:    the n data points
  # z:    an allocation vector of length n
  # pars: K x 3 array of means, variances and weights:
  #   pars[k,1]: mean of component k
  #   pars[k,2]: variance of component k
  #   pars[k,3]: weight of component k
  logpi <- log(pars[, 3])
  means <- pars[, 1]
  sigmas <- sqrt(pars[, 2])
  # Complete log-likelihood: for each observation, the log-weight of its
  # allocated component plus the corresponding normal log-density
  logl <- logpi[z] + dnorm(x, mean = means[z], sd = sigmas[z], log = TRUE)
  return(sum(logl))
}
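# For example, evaluate the complete log-likelihood at the complete MAP
# iteration as a quick sanity check that the function runs:
complete.normal.loglikelihood(x, z[mapindex, ], mcmc.pars[mapindex, , ])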
# and then run (after removing the leading # from each line):
#ls <- label.switching(method = c("ECR", "ECR-ITERATIVE-1", "ECR-ITERATIVE-2",
#                      "PRA", "STEPHENS", "SJW", "AIC", "DATA-BASED"),
#                      zpivot = z[c(mapindex, mapindex.non), ], z = z,
#                      K = K, prapivot = mcmc.pars[mapindex, , ], p = p,
#                      complete = complete.normal.loglikelihood,
#                      mcmc = mcmc.pars, data = x)