piv_MCMC {pivmet}    R Documentation
JAGS/Stan Sampling for Gaussian Mixture Models and Clustering via Co-Association Matrix.
Description
Perform MCMC JAGS sampling or HMC Stan sampling for Gaussian mixture models, post-process the chains and apply a clustering technique to the MCMC sample. Pivotal units for each group are selected according to one of four alternative criteria.
Usage
piv_MCMC(
y,
k,
nMC,
priors,
piv.criterion = c("MUS", "maxsumint", "minsumnoint", "maxsumdiff"),
clustering = c("diana", "hclust"),
software = c("rjags", "rstan"),
burn = 0.5 * nMC,
chains = 4,
cores = 1,
sparsity = FALSE
)
Arguments
y: The observed data: an N-dimensional vector for univariate data, or an N x D matrix for multivariate data.

k: Number of mixture components.

nMC: Number of MCMC iterations for the JAGS/Stan function execution.

priors: Input prior hyperparameters (see Details for default options).

piv.criterion: The pivotal criterion used for identifying one pivot for each group. Possible choices are: "MUS", "maxsumint", "minsumnoint", "maxsumdiff".

clustering: The algorithm adopted for partitioning the N observations into k groups. Possible choices are "diana" (default) and "hclust".

software: The selected MCMC method to fit the model: "rjags" (default) or "rstan".

burn: The burn-in period (only if method "rjags" is chosen). Default is 0.5*nMC.

chains: A positive integer specifying the number of Markov chains. The default is 4.

cores: The number of cores to use when executing the Markov chains in parallel (only if software="rstan" is chosen). Default is 1.

sparsity: Allows for sparse finite mixtures; default is FALSE.
Details
The function fits univariate and multivariate Bayesian Gaussian mixture models of the form (here for univariate only):

(Y_i|Z_i=j) \sim \mathcal{N}(\mu_j,\sigma_j),

where the Z_i, i=1,\ldots,N, are i.i.d. random variables, j=1,\dots,k, \sigma_j is the group variance, Z_i \in \{1,\ldots,k\} are the latent group allocations, and

P(Z_i=j)=\eta_j.

The likelihood of the model is then

L(y;\mu,\eta,\sigma) = \prod_{i=1}^N \sum_{j=1}^k \eta_j \mathcal{N}(\mu_j,\sigma_j),

where (\mu, \sigma)=(\mu_{1},\dots,\mu_{k},\sigma_{1},\ldots,\sigma_{k}) are the component-specific parameters and \eta=(\eta_{1},\dots,\eta_{k}) are the mixture weights. Let \nu denote a permutation of \{1,\ldots,k\}, and let \nu(\mu)=(\mu_{\nu(1)},\ldots,\mu_{\nu(k)}), \nu(\sigma)=(\sigma_{\nu(1)},\ldots,\sigma_{\nu(k)}) and \nu(\eta)=(\eta_{\nu(1)},\ldots,\eta_{\nu(k)}) be the corresponding permutations of \mu, \sigma and \eta. Denote by V the set of all permutations of the indices \{1,\ldots,k\}; the likelihood above is invariant under any permutation \nu \in V, that is

L(y;\mu,\eta,\sigma) = L(y;\nu(\mu),\nu(\eta),\nu(\sigma)).

As a consequence, the model is unidentified with respect to an arbitrary permutation of the labels. When Bayesian inference for the model is performed, if the prior distribution p_0(\mu,\eta,\sigma) is invariant under a permutation of the indices, then so is the posterior. That is, if p_0(\mu,\eta,\sigma) = p_0(\nu(\mu),\nu(\eta),\nu(\sigma)), then

p(\mu,\eta,\sigma| y) \propto p_0(\mu,\eta,\sigma)L(y;\mu,\eta,\sigma)

is multimodal with (at least) k! modes.
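This invariance can be checked numerically. The following minimal sketch (toy data and arbitrary parameter values, not part of the package) evaluates the mixture log-likelihood before and after a label permutation and obtains the same value:

set.seed(1)
y_toy <- c(rnorm(50, -2, 1), rnorm(50, 3, 1))  # toy two-component sample
mu    <- c(-2, 3); sigma <- c(1, 1); eta <- c(0.5, 0.5)
# log-likelihood of a k-component Gaussian mixture
mix_loglik <- function(y, mu, sigma, eta) {
  dens <- sapply(seq_along(mu),
                 function(j) eta[j] * dnorm(y, mu[j], sigma[j]))
  sum(log(rowSums(dens)))
}
nu <- c(2, 1)  # a permutation of the labels {1, 2}
mix_loglik(y_toy, mu, sigma, eta)
mix_loglik(y_toy, mu[nu], sigma[nu], eta[nu])  # identical value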
Depending on the selected software, the model parametrization changes in terms of the prior choices. Specifically, the JAGS philosophy, with its underlying Gibbs sampling, is to use noninformative priors, and conjugate priors are preferred for computational speed. Conversely, Stan adopts weakly informative priors, with no need to exploit conjugacy explicitly.
For univariate mixtures, when software="rjags" the specification is the same as the function BMMmodel of the bayesmix package:

\mu_j \sim \mathcal{N}(\mu_0, 1/B0inv)

\sigma_j \sim \mbox{invGamma}(nu0Half, nu0S0Half)

\eta \sim \mbox{Dirichlet}(1,\ldots,1)

S0 \sim \mbox{Gamma}(g0Half, g0G0Half),

with default values \mu_0=0, B0inv=0.1, nu0Half=10, S0=2, nu0S0Half=nu0Half \times S0, g0Half=5e-17, g0G0Half=5e-33, in accordance with the default specification:

priors=list(kind = "independence", parameter = "priorsFish", hierarchical = "tau")

(see bayesmix for further details and choices).
When software="rstan"
, the prior specification is:
\mu_j \sim \mathcal{N}(\mu_0, 1/B0inv)
\sigma_j \sim \mbox{Lognormal}(\mu_{\sigma}, \tau_{\sigma})
\eta_j \sim \mbox{Uniform}(0,1),
with default values: \mu_0=0, B0inv=0.1, \mu_{\sigma}=0, \tau_{\sigma}=2
.
The users may specify new hyperparameter values with the argument:
priors=list(mu_0=1, B0inv=0.2, mu_sigma=3, tau_sigma=5)
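For instance, a univariate Stan fit with these hyperparameters could be sketched as follows, assuming the fish data from the bayesmix package as in the Examples below (all hyperparameter values are illustrative only):

library(bayesmix)
data(fish)
res_uni <- piv_MCMC(y = fish[,1], k = 5, nMC = 2000,
                    software = "rstan",
                    priors = list(mu_0 = 1, B0inv = 0.2,
                                  mu_sigma = 3, tau_sigma = 5))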
For multivariate mixtures, when software="rjags" the prior specification is the following:

\bm{\mu}_j \sim \mathcal{N}_D(\bm{\mu}_0, S_2)

\Sigma^{-1} \sim \mbox{Wishart}(S_3, D+1)

\eta \sim \mbox{Dirichlet}(\bm{\alpha}),

where \bm{\alpha} is a k-dimensional vector and S_2 and S_3 are positive definite matrices. By default, \bm{\mu}_0=\bm{0}, \bm{\alpha}=(1,\ldots,1), and S_2 and S_3 are diagonal matrices with diagonal elements equal to 1e+05. The user may specify other values for the hyperparameters \bm{\mu}_0, S_2, S_3 and \bm{\alpha} via the priors argument, as follows:

priors = list(mu_0 = c(1,1), S2 = ..., S3 = ..., alpha = ...)

with the constraint that S2 and S3 be positive definite, and \bm{\alpha} a vector of dimension k with nonnegative elements.
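As a sketch, a valid priors list for a bivariate (D = 2) fit with k = 4 components might be built as below, assuming sim$y is a bivariate sample such as the one simulated in the Examples below; diagonal matrices are a simple way to satisfy the positive-definiteness constraint (all values are illustrative only):

D <- 2
priors_mv <- list(mu_0  = rep(0, D),
                  S2    = diag(0.1, D),  # positive definite by construction
                  S3    = diag(0.1, D),
                  alpha = rep(1, 4))     # k = 4 nonnegative Dirichlet weights
res_mv <- piv_MCMC(y = sim$y, k = 4, nMC = 2000, priors = priors_mv)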
When software="rstan"
, the prior specification is:
\bm{\mu}_j \sim \mathcal{N}_D(\bm{\mu}_0, LD*L^{T})
L \sim \mbox{LKJ}(\epsilon)
D^*_j \sim \mbox{HalfCauchy}(0, \sigma_d).
The covariance matrix is expressed in terms of the LDL decomposition as LD*L^{T}
,
a variant of the classical Cholesky decomposition, where L
is a D \times D
lower unit triangular matrix and D*
is a D \times D
diagonal matrix.
The Cholesky correlation factor L
is assigned a LKJ prior with \epsilon
degrees of freedom, which,
combined with priors on the standard deviations of each component, induces a prior on the covariance matrix;
as \epsilon \rightarrow \infty
the magnitude of correlations between components decreases,
whereas \epsilon=1
leads to a uniform prior distribution for L
.
By default, the hyperparameters are \bm{\mu}_0=\bm{0}
, \sigma_d=2.5, \epsilon=1
.
The user may propose some different values with the argument:
priors=list(mu_0=c(1,2), sigma_d = 4, epsilon =2)
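A corresponding sketch of a bivariate Stan fit with these hyperparameters, again assuming sim$y from the Examples below (values are illustrative only):

res_mv_stan <- piv_MCMC(y = sim$y, k = 4, nMC = 2000,
                        software = "rstan",
                        priors = list(mu_0 = c(1, 2),
                                      sigma_d = 4, epsilon = 2))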
If software="rjags"
the function performs JAGS sampling using the bayesmix
package
for univariate Gaussian mixtures, and the runjags
package for multivariate Gaussian mixtures. If software="rstan"
the function performs
Hamiltonian Monte Carlo (HMC) sampling via the rstan
package (see the vignette and the Stan project
for any help).
After MCMC sampling, this function
clusters the units in k
groups,
calls the piv_sel()
function and yields the
pivots obtained from one among four different
methods (the user may specify one among them via piv.criterion
argument):
"maxsumint"
, "minsumnoint"
, "maxsumdiff"
and "MUS"
(available only if k <= 4
)
(see the vignette for thorough details). Due to computational reasons
clarified in the Details section of the function piv_rel
, the
length of the MCMC chains will be minor or equal than the input
argument nMC
; this length, corresponding to the value
true.iter
returned by the procedure, is the number of
MCMC iterations for which
the number of JAGS/Stan groups exactly coincides with the prespecified
number of groups k
.
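A sketch of the typical downstream workflow after fitting, assuming res was returned by piv_MCMC as in the Examples below (see piv_rel and piv_plot for the exact interfaces):

res$true.iter                  # retained iterations (<= nMC)
res$pivots                     # indices of the pivotal units
rel <- piv_rel(mcmc = res)     # pivotal relabelling of the chains
piv_plot(y = sim$y, mcmc = res, rel_est = rel, type = "chains")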
Value
The function gives the MCMC output, the clustering solutions and the pivotal indexes. A complete list of outputs follows.

true.iter: The number of MCMC iterations for which the number of JAGS/Stan groups exactly coincides with the prespecified number of groups k.

Mu: An estimate of the groups' means.

groupPost: A true.iter x N matrix with values from 1 to k containing the post-processed group allocations.

mcmc_mean: If y is a vector, a true.iter x k matrix with the post-processed MCMC chains for the mean parameters; if y is a matrix, a true.iter x D x k array.

mcmc_sd: If y is a vector, a true.iter x k matrix with the post-processed MCMC chains for the standard deviation parameters; if y is a matrix, the analogous post-processed chains.

mcmc_weight: A true.iter x k matrix with the post-processed MCMC chains for the weight parameters.

mcmc_mean_raw: If y is a vector, an nMC x k matrix with the raw MCMC chains for the mean parameters, as given by JAGS/Stan (i.e. not post-processed); if y is a matrix, an nMC x D x k array.

mcmc_sd_raw: If y is a vector, an nMC x k matrix with the raw MCMC chains for the standard deviation parameters; if y is a matrix, the analogous raw chains.

mcmc_weight_raw: An nMC x k matrix with the raw MCMC chains for the weight parameters.

C: The N x N co-association matrix constructed from the MCMC sample.

grr: The vector of cluster membership returned by "diana" or "hclust".

pivots: The vector of indices of pivotal units identified by the selected pivotal criterion.

model: The JAGS/Stan model code. Apply the cat() function for a nice printing.

stanfit: An object of S4 class stanfit (only if software="rstan").
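For instance, assuming res was returned by piv_MCMC as in the Examples below, some of these components can be inspected as follows:

cat(res$model)             # print the underlying JAGS/Stan model code
res$Mu                     # estimated group means
table(res$grr)             # cluster sizes from "diana"/"hclust"
res$groupPost[1:5, 1:10]   # first post-processed group allocations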
Author(s)
Leonardo Egidi legidi@units.it
References
Egidi, L., Pappadà, R., Pauli, F. and Torelli, N. (2018). Relabelling in Bayesian Mixture Models by Pivotal Units. Statistics and Computing, 28(4), 957-969.
Examples
### Bivariate simulation
## Not run:
N <- 200
k <- 4
D <- 2
nMC <- 1000
M1 <- c(-.5,8)
M2 <- c(25.5,.1)
M3 <- c(49.5,8)
M4 <- c(63.0,.1)
Mu <- rbind(M1,M2,M3,M4)
Sigma.p1 <- diag(D)
Sigma.p2 <- 20*diag(D)
W <- c(0.2,0.8)
sim <- piv_sim(N = N, k = k, Mu = Mu,
               Sigma.p1 = Sigma.p1,
               Sigma.p2 = Sigma.p2, W = W)
## rjags (default)
res <- piv_MCMC(y = sim$y, k = k, nMC = nMC)
## rstan
res_stan <- piv_MCMC(y = sim$y, k = k, nMC = nMC,
                     software = "rstan")
# changing priors
res2 <- piv_MCMC(y = sim$y,
                 priors = list(
                   mu_0 = c(1,1),
                   S2 = matrix(c(0.002, 0, 0, 0.1), 2, 2, byrow = TRUE),
                   S3 = matrix(c(0.1, 0, 0, 0.1), 2, 2, byrow = TRUE)),
                 k = k, nMC = nMC)
## End(Not run)
### Fishery data (bayesmix package)
## Not run:
library(bayesmix)
data(fish)
y <- fish[,1]
k <- 5
nMC <- 5000
res <- piv_MCMC(y = y, k = k, nMC = nMC)
# changing priors
res2 <- piv_MCMC(y = y,
                 priors = list(kind = "condconjugate",
                               parameter = "priorsRaftery",
                               hierarchical = "tau"),
                 k = k, nMC = nMC)
## End(Not run)