label.switching {label.switching}R Documentation

Main calling function

Description

This is the main function of the package. It is used to reorder a simulated MCMC sample of the parameters of a mixture (or more general a hidden Markov model) according to eight label switching solving methods: ECR algorithm (default version), ECR algorithm (two iterative versions), PRA algorithm, Stephens' algorithm, Artificial Identifiability Constraint (AIC), Data-Based relabelling and a probabilistic relabelling algorithm (SJW). The input depends on the type of the label switching method. The output contains a list with the permutation returned by each method, the corresponding single best clusterings and the CPU time demanded for each method. In what follows: m denotes the number of MCMC iterations, n denotes the sample size of the observed data, K denotes the number of mixture components and J the number of different types of parameters of the model.

Usage

label.switching(method, zpivot, z, K, prapivot, p, complete, 
			mcmc, sjwinit, data, constraint, 
			groundTruth, thrECR, thrSTE, thrSJW, 
			maxECR, maxSTE, maxSJW, userPerm)

Arguments

method

any non-empty subset of c("ECR","ECR-ITERATIVE-1","PRA","ECR-ITERATIVE-2","STEPHENS","SJW","AIC","DATA-BASED") indicating the desired label-switching solving method. Also available is the option "USER-PERM" which corresponds to a user-defined set of permutations userPerm.

zpivot

d\times n-dimensional array of pivot allocation vectors, where d denotes the number of pivots. This is demanded by the ecr method. The method will be applied d times.

z

m\times n integer array of the latent allocation vectors generated from an MCMC algorithm.

K

the number of mixture components. This is demanded by the ecr, ecr.iterative.1 and ecr.iterative.2 methods.

prapivot

K\times J array containing the parameter that will be used as a pivot by the pra method.

p

m\times n \times K dimensional array of allocation probabilities of the n observations among the K mixture components, for each iteration t = 1,\ldots,m of the MCMC algorithm. This is demanded by the ecr.iterative.2 and stephens methods.

complete

function that returns the complete log-likelihood of the mixture model. Demanded by sjw method.

mcmc

m\times K\times J array of simulated MCMC parameters. Needed by sjw and pra methods.

sjwinit

An index pointing at the MCMC iteration whose parameters will initialize the sjw algorithm (optional).

data

n-dimensional data vector/array. Needed by the sjw and dataBased algorithms.

constraint

An (optional) integer between 1 and J corresponding to the parameter that will be used to apply the Identifiabiality Constraint. In this casethe mcmc output is reordered according to the constraint mcmc[i,1,constraint] < \ldots < mcmc[i,K,constraint]. If constraint = "ALL", all J Identifiability Constraints are applied. Default value: 1.

groundTruth

Optional integer vector of n allocations, which are considered as the 'ground truth' allocations of the n observations among the K mixture components. The output of all methods will be relabelled in a way that the resulting single best clusterings maximize their similarity with the ground truth. This option is very useful in simulation studies or in any other case that the cluster labels are known in order to perform comparisons between methods.

thrECR

An (optional) positive number controlling the convergence criterion for ecr.iterative.1 and ecr.iterative.2. Default value: 1e-6.

thrSTE

An (optional) positive number controlling the convergence criterion for stephens. Default value: 1e-6.

thrSJW

An (optional) positive number controlling the convergence criterion for sjw. Default value: 1e-6.

maxECR

An (optional) integer controlling the max number of iterations for ecr.iterative.1 and ecr.iterative.2. Default value: 100.

maxSTE

An (optional) integer controlling the max number of iterations for stephens. Default value: 100.

maxSJW

An (optional) integer controlling the max number of iterations for sjw. Default value: 100.

userPerm

An (optional) list with user-defined permutations. It is required only if "USER-PERM" has been chosen in method. In this case, userPerm[[i]] is an m\times K array of permutations for all i = 1,\ldots,S, where S denotes the number of permutation arrays. This is useful in case that the user wants to compare his/hers own relabelling method with the available ones.

Value

permutations

an m\times K array of permutations per method.

clusters

an n dimensional vector of best clustering of the the observations for each method.

timings

CPU time needed for each relabelling method.

similarity

correlation matrix between the label switching solving methods in terms of their matching best-clustering allocations.

Note

If the ground truth is not given, all methods are reordered using the estimated single best clustering of the first provided method. The methods sjw and pra are not suggested for large number of components. Also note that sjw might be quite slow even for small number of components. In this case try adjusting thrSJW or maxSJW to smaller values the default ones.

Author(s)

Panagiotis Papastamoulis

References

Papastamoulis P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1): 1-24.

See Also

ecr, ecr.iterative.1, ecr.iterative.2, stephens, pra, sjw, dataBased, aic

Examples

# We will apply the following methods:
# ECR, ECR-ITERATIVE-1, PRA, AIC and DATA-BASED.
# default ECR will use two different pivots.

#load a toy example: MCMC output consists of the random beta model
# applied to a normal mixture of \code{K=2} components. The number of
# observations is equal to \code{n=5}. The number of MCMC samples is
# equal to \code{m=300}. simulated allocations are stored to array \code{z}. 
data("mcmc_output")
mcmc.pars<-data_list$"mcmc.pars"
# mcmc parameters are stored to array \code{mcmc.pars}
# mcmc.pars[,,1]: simulated means of the two components
# mcmc.pars[,,2]: simulated variances 
# mcmc.pars[,,3]: simulated weights 
# We will use two pivots for default ECR algorithm:
# the first one corresponds to iteration \code{mapindex} (complete MAP)
# the second one corresponds to iteration \code{mapindex.non} (observed MAP)

z<-data_list$"z"
K<-data_list$"K"
x<-data_list$"x"
mapindex<-data_list$"mapindex"
mapindex.non<-data_list$"mapindex.non"
# The PRA method will use as pivot the iteration that corresponds to
# the observed MAP estimate (mapindex). 

#Apply (a subset of the available) methods by typing:

ls<-label.switching(method=c("ECR","ECR-ITERATIVE-1","PRA", "AIC","DATA-BASED"),
zpivot=z[c(mapindex,mapindex.non),],z = z,K = K, data = x,
prapivot = mcmc.pars[mapindex,,],mcmc = mcmc.pars)

#plot the raw and reordered means of the K=2 normal mixture components for each method
par(mfrow = c(2,4))
#raw MCMC output for the means (with label switching)
matplot(mcmc.pars[,,1],type="l",
xlab="iteration",main="Raw MCMC output",ylab = "means")
# Reordered outputs
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-1")$output[,,1],type="l",
xlab="iteration",main="ECR (1st pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-2")$output[,,1],type="l",
xlab="iteration",main="ECR (2nd pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-ITERATIVE-1")$output[,,1],
type="l",xlab="iteration",main="ECR-iterative-1",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"PRA")$output[,,1],type="l",
xlab="iteration",main="PRA",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"AIC")$output[,,1],type="l",
xlab="iteration",main="AIC",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"DATA-BASED")$output[,,1],type="l",
xlab="iteration",main="DATA-BASED",ylab = "means")

#######################################################
# if the useR wants to apply the STEPHENS and SJW algorithm as well:
# The STEPHENS method requires the classification probabilities
p<-data_list$"p"

# The SJW method needs to define the complete log-likelihood of the
# model. For the univariate normal mixture, this is done as follows:

complete.normal.loglikelihood<-function(x,z,pars){
	#x: denotes the n data points
	#z: denotes an allocation vector (size=n)
	#pars: K\times 3 vector of means,variance, weights
	# pars[k,1]: corresponds to the mean of component k
	# pars[k,2]: corresponds to the variance of component k
	# pars[k,3]: corresponds to the weight of component k
	g <- dim(pars)[1]
	n <- length(x)
	logl<- rep(0, n)
 	logpi <- log(pars[,3])
	mean <- pars[,1]
	sigma <- sqrt(pars[,2])
	logl<-logpi[z] + dnorm(x,mean = mean[z],sd = sigma[z],log = T)
	return(sum(logl))
}

# and then run (after removing all #):
#ls<-label.switching(method=c("ECR","ECR-ITERATIVE-1","ECR-ITERATIVE-2",
#"PRA","STEPHENS","SJW","AIC","DATA-BASED"),
#zpivot=z[c(mapindex,mapindex.non),],z = z,
#K = K,prapivot = mcmc.pars[mapindex,,],p=p,
#complete = complete.normal.loglikelihood,mcmc.pars,
#data = x)


[Package label.switching version 1.8 Index]