bayes_fit {BayesMultiMode}    R Documentation

Bayesian estimation of mixture distributions

Description

Estimation of a univariate mixture with unknown number of components using a sparse finite mixture Markov chain Monte Carlo (SFM MCMC) algorithm.

Usage

bayes_fit(
  data,
  K,
  dist,
  priors = list(),
  nb_iter = 2000,
  burnin = nb_iter/2,
  print = TRUE
)

Arguments

data

Vector of observations.

K

Maximum number of mixture components.

dist

String indicating the distribution of the mixture components; currently supports "normal", "skew_normal", "poisson" and "shifted_poisson".

priors

List of priors; the default is an empty list, which implies the following priors:
a0 = 1,
A0 = 200,
b0 = median(y),
B0 = (max(y) - min(y))^2 (normal),
D_xi = 1,
D_psi = 1 (skew normal: B0 = diag(D_xi, D_psi)),
c0 = 2.5,
l0 = 1.1 (poisson),
l0 = 5 (shifted poisson),
L0 = 1.1/median(y) (poisson),
L0 = l0 - 1 (shifted poisson),
g0 = 0.5,
G0 = 100*g0/c0/B0 (normal),
G0 = g0/(0.5*var(y)) (skew normal).

nb_iter

Number of MCMC iterations; default is 2000.

burnin

Number of MCMC iterations used as burnin; default is nb_iter/2.

print

Should the MCMC progression be displayed? Default is TRUE.

Details

Let y_i, i=1,\dots,n denote observations. A general mixture of K distributions from the same parametric family is given by:

y_i \sim \sum_{k=1}^{K}\pi_k p(\cdot|\theta_k)

with \sum_{k=1}^{K}\pi_k=1 and \pi_k\geq 0, k=1,\dots,K.
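
For illustration only (not package code), this formula can be evaluated directly in R; the sketch below assumes Normal components with made-up weights and parameters:

pi_k    <- c(0.5, 0.3, 0.2)              # mixture weights (sum to 1)
mu_k    <- c(-2, 0, 3)                   # component means
sigma_k <- c(1, 0.5, 1.5)                # component standard deviations
x       <- seq(-6, 8, length.out = 200)
# mixture density: sum over k of pi_k * p(x | theta_k)
mix_pdf <- rowSums(sapply(1:3, function(k) pi_k[k] * dnorm(x, mu_k[k], sigma_k[k])))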

The exact number of components does not have to be known a priori when using an SFM MCMC approach. Rather, an upper bound is specified for the number of components and the weights of superfluous components are shrunk towards zero during estimation. Following Malsiner-Walli et al. (2016), a symmetric Dirichlet prior is used for the mixture weights:

(\pi_1,\dots,\pi_K) \sim \text{Dirichlet}(e_0,\dots,e_0),

where a Gamma hyperprior is used on the concentration parameter e_0:

e_0 \sim \text{Gamma}\left(a_0, A_0\right).
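
As a rough illustration (not package code) of how superfluous weights are shrunk towards zero, the sketch below draws e_0 from its Gamma hyperprior and then a weight vector from the symmetric Dirichlet via its Gamma representation; the shape/rate form of Gamma(a_0, A_0) is an assumption made for the sketch:

set.seed(1)
a0 <- 1; A0 <- 200; K <- 5              # default hyperparameters, upper bound K
e0 <- rgamma(1, shape = a0, rate = A0)  # concentration parameter (assumed rate form)
g  <- rgamma(K, shape = e0, rate = 1)   # independent Gamma(e0, 1) draws ...
pi_k <- g / sum(g)                      # ... normalised: Dirichlet(e0, ..., e0)
round(pi_k, 3)                          # with small e0, most weights are near zero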

Mixture of Normal distributions

Normal components take the form:

p(y_i|\mu_k,\sigma_k) = \frac{1}{\sqrt{2 \pi} \ \sigma_k} \exp\left( - \, \frac{1}{2} \left( \frac{y_i - \mu_k}{\sigma_k} \right)^2 \right).

Independent conjugate priors are used for \mu_k and \sigma^2_k (see for instance Malsiner-Walli et al. 2016):

\mu_k \sim \text{Normal}( \text{b}_0, \text{B}_0),

\sigma^{-2}_k \sim \text{Gamma}( \text{c}_0, \text{C}_0),

C_0 \sim \text{Gamma}( \text{g}_0, \text{G}_0).
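
As an illustration only (not package code), one draw of the component parameters from this hierarchical prior, using the default hyperparameters listed under the priors argument; the shape/rate form of the Gamma distributions is an assumption of the sketch:

set.seed(1)
y  <- galaxy
b0 <- median(y); B0 <- (max(y) - min(y))^2
c0 <- 2.5; g0 <- 0.5; G0 <- 100 * g0 / c0 / B0
C0     <- rgamma(1, shape = g0, rate = G0)       # C_0 ~ Gamma(g0, G0)
sigma2 <- 1 / rgamma(1, shape = c0, rate = C0)   # sigma_k^{-2} ~ Gamma(c0, C0)
mu     <- rnorm(1, mean = b0, sd = sqrt(B0))     # mu_k ~ Normal(b0, B0)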

Mixture of skew-Normal distributions

We use the skew-Normal of Azzalini (1985) which takes the form:

p(y_i| \xi_k,\omega_k,\alpha_k) = \frac{1}{\omega_k\sqrt{2\pi}} \ \exp\left( - \, \frac{1}{2} \left( \frac{y_i - \xi_k}{\omega_k} \right)^2\right) \ \left(1 + \text{erf}\left( \alpha_k\left(\frac{y_i - \xi_k}{\omega_k\sqrt{2}}\right)\right)\right),

where \xi_k is a location parameter, \omega_k a scale parameter and \alpha_k the shape parameter introducing skewness. For Bayesian estimation, we adopt the approach of Frühwirth-Schnatter and Pyne (2010) and use the following reparameterised random-effect model:

z_i \sim TN_{[0,\infty)}(0, 1),

y_i|(S_i = k) = \xi_k + \psi_k z_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2_k),

where the parameters of the skew-Normal are recovered with

\alpha_k = \frac{\psi_k}{\sigma_k}, \qquad \omega^2_k = \sigma^2_k + \psi^2_k.

By defining a regressor x_i = (1, z_i)', the skew-Normal mixture can be seen as a random effects model and sampled using standard techniques. Thus, we use priors similar to those of the Normal mixture model:

(\xi_k, \psi_k)' \sim \text{Normal}(\text{b}_0, \text{B}_0),

\sigma^{-2}_k \sim \text{Gamma}(\text{c}_0, \text{C}_0),

\text{C}_0 \sim \text{Gamma}( \text{g}_0, \text{G}_0).

We set

\text{b}_0 = (\text{median}(y), 0)'

and

\text{B}_0 = \text{diag}(\text{D}_{\text{xi}}, \text{D}_{\text{psi}})

with D_xi = D_psi = 1.
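
For illustration only (not package code), the density above can be written with pnorm() instead of erf(), and the mapping from the random-effect parameters (\xi_k, \psi_k, \sigma_k) back to (\xi_k, \omega_k, \alpha_k) is direct:

dskewnorm <- function(y, xi, omega, alpha) {
  # Azzalini (1985) skew-Normal density, equivalent to the erf form above
  2 / omega * dnorm((y - xi) / omega) * pnorm(alpha * (y - xi) / omega)
}
xi <- 0; psi <- 2; sigma <- 1    # made-up random-effect parameters
omega <- sqrt(sigma^2 + psi^2)   # omega_k^2 = sigma_k^2 + psi_k^2
alpha <- psi / sigma             # alpha_k   = psi_k / sigma_k
dskewnorm(1.5, xi, omega, alpha)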

Mixture of Poisson distributions

Poisson components take the form:

p(y_i|\lambda_k) = \frac{1}{y_i!} \, \lambda^{y_i}_k \,\exp(-\lambda_k).

The prior for \lambda_k follows Viallefont et al. (2002):

\lambda_k \sim \text{Gamma}(\text{l}_0,\text{L}_0).
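
For illustration only (not package code), a two-component Poisson mixture pmf with rates drawn from this Gamma prior; the shape/rate form of Gamma(l_0, L_0) and the value used for median(y) are assumptions of the sketch:

set.seed(1)
l0 <- 1.1; L0 <- 1.1 / 5                       # pretend median(y) = 5
lambda_k <- rgamma(2, shape = l0, rate = L0)   # lambda_k ~ Gamma(l0, L0)
pi_k <- c(0.6, 0.4)
y    <- 0:30
mix_pmf <- pi_k[1] * dpois(y, lambda_k[1]) + pi_k[2] * dpois(y, lambda_k[2])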

Mixture of shifted-Poisson distributions

Shifted-Poisson components take the form

p(y_i|\lambda_k, \kappa_k) = \frac{1}{(y_i - \kappa_k)!} \, \lambda^{(y_i - \kappa_k)}_k \,\exp(-\lambda_k),

where \kappa_k is a location or shift parameter with uniform prior, see Cross et al. (2024).
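
For illustration only (not package code), the shifted-Poisson pmf is a standard Poisson pmf evaluated at y_i - \kappa_k, with zero mass below the shift:

dshiftpois <- function(y, lambda, kappa) {
  # zero probability for y < kappa, Poisson(lambda) pmf at y - kappa otherwise
  ifelse(y >= kappa, dpois(y - kappa, lambda), 0)
}
dshiftpois(0:10, lambda = 2, kappa = 3)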

Value

A list of class bayes_mixture containing:

data

Same as argument.

mcmc

Matrix of MCMC draws where the rows corresponding to burnin have been discarded.

mcmc_all

Matrix of MCMC draws.

loglik

Log likelihood at each MCMC draw.

K

Number of components.

dist

Same as argument.

pdf_func

The pdf/pmf of the mixture components.

dist_type

Type of the distribution, i.e. continuous or discrete.

pars_names

Names of the mixture components' parameters.

loc

Name of the location parameter of the mixture components.

nb_var

Number of variables/parameters in the mixture distribution.
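
For illustration only (not package code), the elements above can be accessed directly from the returned list; the sketch assumes a fitted object bayesmix as in the Examples below:

# dim(bayesmix$mcmc)                 # post-burnin draws: iterations x parameters
# colnames(bayesmix$mcmc)            # names of the component parameters
# plot(bayesmix$loglik, type = "l")  # log-likelihood trace across draws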

References

Azzalini A (1985). “A Class of Distributions Which Includes the Normal Ones.” Scandinavian Journal of Statistics, 12(2), 171–178. ISSN 0303-6898.

Cross JL, Hoogerheide L, Labonne P, van Dijk HK (2024). “Bayesian mode inference for discrete distributions in economics and finance.” Economics Letters, 235, 111579. ISSN 0165-1765, doi:10.1016/j.econlet.2024.111579.

Frühwirth-Schnatter S, Pyne S (2010). “Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions.” Biostatistics, 11(2), 317–336. ISSN 1465-4644, doi:10.1093/biostatistics/kxp062.

Malsiner-Walli G, Frühwirth-Schnatter S, Grün B (2016). “Model-based clustering based on sparse finite Gaussian mixtures.” Statistics and Computing, 26(1), 303–324. ISSN 1573-1375, doi:10.1007/s11222-014-9500-2.

Viallefont V, Richardson S, Green PJ (2002). “Bayesian analysis of Poisson mixtures.” Journal of Nonparametric Statistics, 14(1-2), 181–202.

Examples

# Example with galaxy data ================================================
set.seed(123) 

# retrieve galaxy data
y = galaxy

# estimation
bayesmix = bayes_fit(data = y,
                     K = 5, # small K to run the example rapidly
                     dist = "normal",
                     nb_iter = 500, # few iterations to run the example rapidly
                     burnin = 100)
                           
# plot estimated mixture
# plot(bayesmix, max_size = 200)

# Changing priors ================================================
set.seed(123) 

# retrieve galaxy data
y = galaxy

# estimation
K = 5
bayesmix = bayes_fit(data = y,
                     K = K, # small K to run the example rapidly
                     dist = "normal",
                     priors = list(a0 = 10,
                                   A0 = 10*K),
                     nb_iter = 500, # few iterations to run the example rapidly
                     burnin = 100)
                           
# plot estimated mixture
# plot(bayesmix, max_size = 200)

# Example with DNA data =====================================================

set.seed(123) 

# retrieve DNA data
y = d4z4

# estimation
bayesmix = bayes_fit(data = y,
                     K = 5, # small K to run the example rapidly
                     dist = "shifted_poisson",
                     nb_iter = 500, # few iterations to run the example rapidly
                     burnin = 100)
                           
# plot estimated mixture
# plot(bayesmix, max_size = 200)



[Package BayesMultiMode version 0.7.1]