PYdensity {BNPmix} | R Documentation |
MCMC for Pitman-Yor mixtures of Gaussians
Description
The PYdensity
function generates a posterior density sample for a selection of univariate and multivariate Pitman-Yor
process mixture models with Gaussian kernels. See details below for the description of the different specifications of the implemented models.
Usage
PYdensity(y, mcmc = list(), prior = list(), output = list())
Arguments
y |
a vector or matrix giving the data based on which the density is to be estimated;
|
mcmc |
a list of MCMC arguments:
-
niter (mandatory), number of iterations.
-
nburn (mandatory), number of iterations to discard as burn-in.
-
method , the MCMC sampling method to be used, options are 'ICS' , 'MAR' and 'SLI' (default is 'ICS' ). See details.
-
model , the type of model to be fitted (default is 'LS' ). See details.
-
nupd , argument controlling the number of iterations to be displayed on screen: the function reports
on standard output every time nupd new iterations have been carried out (default is niter/10 ).
-
print_message , control option. If equal to TRUE , the status is printed
to standard output every nupd iterations (default is TRUE ).
-
m_imp , number of generated values for the importance sampling step of method = 'ICS' (default is 10). See details.
-
slice_type , when method = 'SLI' it specifies the type of slice sampler. Options are 'DEP' for dependent slice-efficient, and 'INDEP' for independent slice-efficient (default is 'DEP' ). See details.
-
hyper , if equal to TRUE , hyperprior distributions on the base measure's
parameters are added, as specified in prior and explained in details (default is TRUE ).
|
prior |
a list giving the prior information. The list includes strength and discount ,
the strength and discount parameters of the Pitman-Yor process
(default are strength = 1 and discount = 0 , the latter leading to the Dirichlet process).
The remaining parameters depend on the model choice.
If model = 'L' (location mixture) and y is univariate:
m0 and s20 are
mean and variance of the base measure on the location parameter (default are sample mean and sample variance of the data);
a0 and b0 are shape and scale parameters of the inverse gamma prior on the common scale parameter
(default are 2 and the sample variance of the data).
If hyper = TRUE , optional hyperpriors on the base measure's parameters are added:
specifically, m1 and k1 are the mean parameter and the scale factor defining the
normal hyperprior on m0 (default are the sample mean of the data and 1), and
a1 and b1 are shape and rate parameters of the inverse gamma hyperprior on b0
(default are 2 and the sample variance of the data). See details.
If model = 'LS' (location-scale mixture) and y is univariate:
m0 and k0 are the mean parameter and the scale factor defining the normal base measure
on the location parameter (default are the sample mean of the data and 1), and a0 and b0 are
shape and scale parameters of the inverse gamma base measure on the scale parameter (default are 2 and the sample variance of the data).
If hyper = TRUE , optional hyperpriors on the base measure's parameters are added:
specifically, m1 and s21 are mean and variance parameters of the normal hyperprior on
m0 (default are the sample mean and the sample variance of the data);
tau1 and zeta1 are shape and rate parameters of the gamma hyperprior on
k0 (default is 1 for both);
a1 and b1 are shape and rate parameters of the gamma hyperprior on
b0 (default are the sample variance of the data and 1). See details.
If model = 'L' (location mixture) and y is multivariate (p -variate):
m0 and S20 are
mean and covariance of the base measure on the location parameter (default are the sample mean and the sample covariance of the data);
Sigma0 and n0 are the parameters of the inverse Whishart prior on
the common scale matrix (default are the sample covariance of the data and p +2).
If hyper = TRUE , optional hyperpriors on the base measure's parameters are added:
specifically, m1 and k1 are the mean parameter and the scale factor defining the
normal hyperprior on m0 (default are the sample mean of the data and 1), and
lambda and Lambda1 are the parameters (degrees of freedom and scale) of the inverse Wishart prior on S20
(default are p +2 and the sample covariance of the data). See details.
If model = 'LS' (location-scale mixture) and y is multivariate (p -variate):
m0 and k0 are the mean parameter and the scale factor defining the normal base measure on the
location parameter (default are the sample mean of the data and 1), and
n0 and Sigma0 are the parameters (degrees of freedom and scale) of the inverse Wishart base measure on the location parameter
(default are p +2 and the sample covariance of the data).
If hyper = TRUE , optional hyperpriors on the base measure's parameters are added:
specifically, m1 and S1 are mean and covariance matrix parameters of the normal hyperprior on
m0 (default are the sample mean and the sample covariance of the data);
tau1 and zeta1 are shape and rate parameters of the gamma hyperprior on
k0 (default is 1 for both);
n1 and Sigma1 are the parameters (degrees of freedom and scale) of the Wishart prior for Sigma0
(default are p +2 and the sample covariance of the data divided p +2). See details.
If model = 'DLS' (diagonal location-scale mixture):
m0 and k0 are the mean vector parameter and the vector of scale factors defining the normal base measure
on the location parameter (default are the sample mean and a vector of ones),
and a0 and b0 are vectors of
shape and scale parameters defining the base measure on the scale parameters (default are a vector of twos and the diagonal
of the sample covariance of the data).
If hyper = TRUE , optional hyperpriors on the base measure's parameters are added:
specifically, m1 and s21 are vectors of mean and variance parameters for the normal hyperpriors on the components of
m0 (default are the sample mean and the diagonal of the sample covariance of the data);
tau1 and zeta1 are vectors of shape and rate parameters of the gamma hyperpriors on the components of
k0 (default is a vector of ones for both);
a1 and b1 are vectors of shape and rate parameters of the gamma hyperpriors on the components of
b0 (default is the diagonal of the sample covariance of the data and a vector of ones). See details.
|
output |
a list of arguments for generating posterior output. It contains:
-
grid , a grid of points at which to evaluate the estimated posterior mean density; a data frame
obtained with the expand.grid function.
-
out_param , if equal to TRUE , the function returns the draws of the kernel's
parameters for each MCMC iteration, default is FALSE . See value for details.
-
out_type , if out_type = "FULL" , the function returns the visited
partitions and the realizations of the posterior density for each iterations.
If out_type = "MEAN" , the function returns the estimated partitions and the mean of the densities sampled at each iterations.
If out_type = "CLUST" , the function returns the estimated partition.
Default "FULL" .
|
Details
This generic function fits a Pitman-Yor process mixture model for density estimation and clustering. The general model is
f~(y)=∫K(y;θ)p~(dθ),
where K(y;θ)
is a kernel density with parameter
θ∈Θ
. Univariate and multivariate Gaussian kernels are implemented with different specifications for the parametric space
Θ
, as described below.
The mixing measure p~
has a Pitman-Yor process prior with strength parameter ϑ
,
discount parameter α
, and base measure P0
admitting the specifications presented below. For posterior sampling,
three MCMC approaches are implemented. See details below.
Univariate data
For univariate y
the function implements both a location and location-scale mixture model. The former assumes
f~(y)=∫ϕ(y;μ,σ2)p~(dμ)π(σ2),
where
ϕ(y;μ,σ2)
is a univariate Gaussian kernel function with mean μ
and variance σ2
,
and π(σ2)
is an inverse gamma prior. The base measure is specified as
P0(dμ)=N(dμ;m0,σ02),
and σ2∼IGa(a0,b0)
.
Optional hyperpriors for the base measure's parameters are
(m0,σ02)∼N(m1,σ02/k1)×IGa(a1,b1).
The location-scale mixture model, instead, assumes
f~(y)=∫ϕ(y;μ,σ2)p~(dμ,dσ2)
with normal-inverse gamma base measure
P0(dμ,dσ2)=N(dμ;m0,σ2/k0)×IGa(dσ2;a0,b0).
and (optional) hyperpriors
m0∼N(m1,σ12),k0∼Ga(τ1,ζ1),b0∼Ga(a1,b1).
Multivariate data
For multivariate y
(p
-variate) the function implements a location mixture model (with full covariance matrix) and two
different location-scale mixture models, with either full or diagonal covariance matrix. The location mixture model assumes
f~(y)=∫ϕp(y;μ,Σ)p~(dμ)π(Σ)
where
ϕp(y;μ,Σ)
is a p
-dimensional Gaussian kernel function with mean vector μ
and covariance matrix
Σ
. The prior on Σ
is inverse Whishart with parameters Σ0
and ν0
, while the
base measure is
P0(dμ)=N(dμ;m0,S0),
with optional hyperpriors
m0∼N(m1,S0/k1),S0∼IW(λ1,Λ1).
The location-scale mixture model assumes
f~(x)=∫ϕp(y;μ,Σ)p~(dμ,dΣ).
Two possible structures for Σ
are implemented, namely full and diagonal covariance. For the full covariance mixture model, the base measure is
the normal-inverse Wishart
P0(dμ,dΣ)=N(dμ;m0,Σ/k0)×IW(dΣ;ν0,Σ0),
with optional hyperpriors
m0∼N(m1,S1),k0∼Ga(τ1,ζ1),b0∼W(ν1,Σ1).
The second location-scale mixture model assumes a diagonal covariance structure. This is equivalent to write the
mixture model as a mixture of products of univariate normal kernels, i.e.
f~(y)=∫∏r=1pϕ(yr;μr,σr2)p~(dμ1,…,dμp,dσ12,…,dσp2).
For this specification, the base measure is assumed defined as the product of p
independent normal-inverse gamma distributions, that is
P0=∏r=1pP0r
where
P0r(dμr,dσr2)=N(dμr;m0r,σr2/k0r)×Ga(dσr2;a0r,b0r).
Optional hyperpriors can be added, and, for each component, correspond to the set of hyperpriors considered
for the univariate location-scale mixture model.
Posterior simulation methods
This generic function implements three types of MCMC algorithms for posterior simulation.
The default method is the importance conditional sampler 'ICS'
(Canale et al. 2019). Other options are
the marginal sampler 'MAR'
(Neal, 2000) and the slice sampler 'SLI'
(Kalli et al. 2011).
The importance conditional sampler performs an importance sampling step when updating the values of
individual parameters θ
, which requires to sample m_imp
values from a suitable
proposal. Large values of m_imp
are known to improve the mixing of the chain
at the cost of increased running time (Canale et al. 2019). Two options are available for the slice sampler,
namely the dependent slice-efficient sampler (slice_type = 'DEP'
), which is set as default, and the
independent slice-efficient sampler (slice_type = 'INDEP'
) (Kalli et al. 2011). See Corradin et al. (to appear)
for more details.
Value
A BNPdens
class object containing the estimated density and
the cluster allocations for each iterations. If out_param = TRUE
the output
contains also the kernel specific parameters for each iteration. If mcmc_dens = TRUE
the output
contains also a realization from the posterior density for each iteration. IF mean_dens = TRUE
the output contains just the mean of the realizations from the posterior density. The output contains
also informations as the number of iterations, the number of burn-in iterations, the used
computational time and the type of estimated model (univariate = TRUE
or FALSE
).
References
Canale, A., Corradin, R., Nipoti, B. (2019), Importance conditional sampling for Bayesian nonparametric mixtures,
arXiv preprint, arXiv:1906.08147
Corradin, R., Canale, A., Nipoti, B. (2021), BNPmix: An R Package for Bayesian Nonparametric Modeling via Pitman-Yor Mixtures,
Journal of Statistical Software, 100, doi:10.18637/jss.v100.i15
Kalli, M., Griffin, J. E., and Walker, S. G. (2011), Slice sampling mixture models.
Statistics and Computing 21, 93-105, doi:10.1007/s11222-009-9150-y
Neal, R. M. (2000), Markov Chain Sampling Methods for Dirichlet Process Mixture Models,
Journal of Computational and Graphical Statistics 9, 249-265, doi:10.2307/1390653
Examples
data_toy <- cbind(c(rnorm(100, -3, 1), rnorm(100, 3, 1)),
c(rnorm(100, -3, 1), rnorm(100, 3, 1)))
grid <- expand.grid(seq(-7, 7, length.out = 50),
seq(-7, 7, length.out = 50))
est_model <- PYdensity(y = data_toy, mcmc = list(niter = 200, nburn = 100),
output = list(grid = grid))
summary(est_model)
plot(est_model)
[Package
BNPmix version 1.0.2
Index]