PYdensity {BNPmix}    R Documentation
MCMC for Pitman-Yor mixtures of Gaussians
Description
The PYdensity
function generates a posterior density sample for a selection of univariate and multivariate Pitman-Yor
process mixture models with Gaussian kernels. See details below for the description of the different specifications of the implemented models.
Usage
PYdensity(y, mcmc = list(), prior = list(), output = list())
Arguments
y
a vector or matrix giving the data based on which the density is to be estimated.
mcmc
a list of MCMC arguments.
prior
a list giving the prior information.
output
a list of arguments for generating posterior output.
Details
This generic function fits a Pitman-Yor process mixture model for density estimation and clustering. The general model is

\tilde f(y) = \int K(y; \theta) \tilde p (d \theta),

where K(y; \theta) is a kernel density with parameter \theta \in \Theta. Univariate and multivariate Gaussian kernels are implemented with different specifications for the parametric space \Theta, as described below.
The mixing measure \tilde p has a Pitman-Yor process prior with strength parameter \vartheta, discount parameter \alpha, and base measure P_0 admitting the specifications presented below. For posterior sampling, three MCMC approaches are implemented. See details below.
Univariate data
For univariate y the function implements both a location and a location-scale mixture model. The former assumes

\tilde f(y) = \int \phi(y; \mu, \sigma^2) \tilde p (d \mu) \pi(\sigma^2),

where \phi(y; \mu, \sigma^2) is a univariate Gaussian kernel function with mean \mu and variance \sigma^2, and \pi(\sigma^2) is an inverse gamma prior. The base measure is specified as

P_0(d \mu) = N(d \mu; m_0, \sigma^2_0),

and \sigma^2 \sim IGa(a_0, b_0). Optional hyperpriors for the parameters of the base measure are

(m_0, \sigma^2_0) \sim N(m_1, \sigma^2_0 / k_1) \times IGa(a_1, b_1).
The location-scale mixture model, instead, assumes

\tilde f(y) = \int \phi(y; \mu, \sigma^2) \tilde p (d \mu, d \sigma^2)

with normal-inverse gamma base measure

P_0 (d \mu, d \sigma^2) = N(d \mu; m_0, \sigma^2 / k_0) \times IGa(d \sigma^2; a_0, b_0),

and (optional) hyperpriors

m_0 \sim N(m_1, \sigma_1^2), \quad k_0 \sim Ga(\tau_1, \zeta_1), \quad b_0 \sim Ga(a_1, b_1).
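As an illustration, a univariate location-scale fit might be sketched as follows. This is a minimal sketch, not a definitive call: the argument names model = "LS" (selecting the location-scale specification), strength, and discount follow the BNPmix naming conventions and should be checked against the package version in use.

```r
library(BNPmix)

# simulate a bimodal univariate sample
set.seed(42)
y_uni <- c(rnorm(100, -3, 1), rnorm(100, 3, 1))

# location-scale mixture with Pitman-Yor strength and discount
# parameters specified in the prior list (names assumed, see above)
fit_uni <- PYdensity(y = y_uni,
                     mcmc = list(niter = 1000, nburn = 500, model = "LS"),
                     prior = list(strength = 1, discount = 0.1),
                     output = list(grid = seq(-7, 7, length.out = 100)))
plot(fit_uni)
```

A location mixture (fixed common variance with inverse gamma prior) would be obtained analogously by selecting the location model instead of the location-scale one.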
Multivariate data
For multivariate y (p-variate) the function implements a location mixture model (with full covariance matrix) and two different location-scale mixture models, with either full or diagonal covariance matrix. The location mixture model assumes

\tilde f(y) = \int \phi_p(y; \mu, \Sigma) \tilde p (d \mu) \pi(\Sigma),

where \phi_p(y; \mu, \Sigma) is a p-dimensional Gaussian kernel function with mean vector \mu and covariance matrix \Sigma. The prior on \Sigma is inverse Wishart with parameters \Sigma_0 and \nu_0, while the base measure is

P_0(d \mu) = N(d \mu; m_0, S_0),

with optional hyperpriors

m_0 \sim N(m_1, S_0 / k_1), \quad S_0 \sim IW(\lambda_1, \Lambda_1).
The location-scale mixture model assumes

\tilde f(y) = \int \phi_p(y; \mu, \Sigma) \tilde p (d \mu, d \Sigma).

Two possible structures for \Sigma are implemented, namely full and diagonal covariance. For the full covariance mixture model, the base measure is the normal-inverse Wishart

P_0 (d \mu, d \Sigma) = N(d \mu; m_0, \Sigma / k_0) \times IW(d \Sigma; \nu_0, \Sigma_0),

with optional hyperpriors

m_0 \sim N(m_1, S_1), \quad k_0 \sim Ga(\tau_1, \zeta_1), \quad \Sigma_0 \sim W(\nu_1, \Sigma_1).
The second location-scale mixture model assumes a diagonal covariance structure. This is equivalent to writing the mixture model as a mixture of products of univariate normal kernels, i.e.

\tilde f(y) = \int \prod_{r=1}^p \phi(y_r; \mu_r, \sigma^2_r) \tilde p (d \mu_1, \ldots, d \mu_p, d \sigma_1^2, \ldots, d \sigma_p^2).

For this specification, the base measure is defined as the product of p independent normal-inverse gamma distributions, that is P_0 = \prod_{r=1}^p P_{0r}, where

P_{0r}(d \mu_r, d \sigma_r^2) = N(d \mu_r; m_{0r}, \sigma^2_r / k_{0r}) \times IGa(d \sigma^2_r; a_{0r}, b_{0r}).

Optional hyperpriors can be added and, for each component, correspond to the set of hyperpriors considered for the univariate location-scale mixture model.
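A diagonal-covariance multivariate fit could be sketched as follows. The argument name model = "DLS" (diagonal location-scale) is an assumption based on the BNPmix naming conventions and should be verified against the installed version.

```r
library(BNPmix)

# two well-separated bivariate clusters
set.seed(1)
y_mv <- cbind(c(rnorm(100, -3, 1), rnorm(100, 3, 1)),
              c(rnorm(100, -3, 1), rnorm(100, 3, 1)))

# evaluation grid for the posterior density
grid_mv <- expand.grid(seq(-7, 7, length.out = 40),
                       seq(-7, 7, length.out = 40))

# diagonal location-scale mixture: product of univariate normal kernels
fit_diag <- PYdensity(y = y_mv,
                      mcmc = list(niter = 1000, nburn = 500, model = "DLS"),
                      output = list(grid = grid_mv))
plot(fit_diag)
```

With p moderate to large, the diagonal specification is cheaper per iteration than the full-covariance normal-inverse Wishart model, at the cost of not capturing within-component correlation.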
Posterior simulation methods
This generic function implements three types of MCMC algorithms for posterior simulation. The default method is the importance conditional sampler 'ICS' (Canale et al., 2019). Other options are the marginal sampler 'MAR' (Neal, 2000) and the slice sampler 'SLI' (Kalli et al., 2011).
The importance conditional sampler performs an importance sampling step when updating the values of individual parameters \theta, which requires sampling m_imp values from a suitable proposal. Large values of m_imp are known to improve the mixing of the chain at the cost of increased running time (Canale et al., 2019). Two options are available for the slice sampler, namely the dependent slice-efficient sampler (slice_type = 'DEP'), which is the default, and the independent slice-efficient sampler (slice_type = 'INDEP') (Kalli et al., 2011). See Corradin et al. (2021) for more details.
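The sampler is chosen through the mcmc list. The following sketch assumes the method, m_imp, and slice_type argument names described above; exact spellings should be confirmed against the package documentation.

```r
library(BNPmix)

set.seed(2)
y <- c(rnorm(100, -3, 1), rnorm(100, 3, 1))

# marginal sampler (Neal, 2000)
fit_mar <- PYdensity(y = y,
                     mcmc = list(niter = 1000, nburn = 500, method = "MAR"))

# independent slice-efficient sampler (Kalli et al., 2011)
fit_sli <- PYdensity(y = y,
                     mcmc = list(niter = 1000, nburn = 500,
                                 method = "SLI", slice_type = "INDEP"))

# importance conditional sampler with a larger importance sample,
# trading running time for better mixing
fit_ics <- PYdensity(y = y,
                     mcmc = list(niter = 1000, nburn = 500,
                                 method = "ICS", m_imp = 100))
```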
Value
A BNPdens class object containing the estimated density and the cluster allocations for each iteration. If out_param = TRUE, the output also contains the kernel-specific parameters for each iteration. If mcmc_dens = TRUE, the output also contains a realization from the posterior density for each iteration. If mean_dens = TRUE, the output contains just the mean of the realizations from the posterior density. The output also contains information such as the number of iterations, the number of burn-in iterations, the computing time, and the type of estimated model (univariate = TRUE or FALSE).
References
Canale, A., Corradin, R., Nipoti, B. (2019), Importance conditional sampling for Bayesian nonparametric mixtures, arXiv preprint, arXiv:1906.08147
Corradin, R., Canale, A., Nipoti, B. (2021), BNPmix: An R Package for Bayesian Nonparametric Modeling via Pitman-Yor Mixtures, Journal of Statistical Software, 100, doi:10.18637/jss.v100.i15
Kalli, M., Griffin, J. E., and Walker, S. G. (2011), Slice sampling mixture models. Statistics and Computing 21, 93-105, doi:10.1007/s11222-009-9150-y
Neal, R. M. (2000), Markov Chain Sampling Methods for Dirichlet Process Mixture Models, Journal of Computational and Graphical Statistics 9, 249-265, doi:10.2307/1390653
Examples
data_toy <- cbind(c(rnorm(100, -3, 1), rnorm(100, 3, 1)),
                  c(rnorm(100, -3, 1), rnorm(100, 3, 1)))
grid <- expand.grid(seq(-7, 7, length.out = 50),
                    seq(-7, 7, length.out = 50))
est_model <- PYdensity(y = data_toy,
                       mcmc = list(niter = 200, nburn = 100),
                       output = list(grid = grid))
summary(est_model)
plot(est_model)