PYregression {BNPmix}  R Documentation 
The PYregression
function generates a posterior sample
for mixtures of linear regression models, inspired by the ANOVA-DDP model
introduced in De Iorio et al. (2004). See Details below for the model specification.
PYregression(y, x, mcmc = list(), prior = list(), output = list())
y
a vector of observations, the univariate dependent variable;

x
a matrix of observations, the multivariate independent variables;

mcmc
a list of MCMC arguments;

prior
a list giving the prior information;

output
a list of posterior summaries.
This function fits a Pitman-Yor process mixture of Gaussian linear regression models, i.e.
\tilde f(y) = \int φ(y; x^T β, σ^2) \tilde p (d β, d σ^2),
where x is a bivariate vector containing the independent variable
and a value of 1
for the intercept term.
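To make the kernel above concrete, the base-R sketch below (not part of BNPmix; all component values are hypothetical) evaluates a finite two-component approximation of the mixture density \tilde f(y) at a given covariate vector x:

```r
# Finite approximation of f~(y) = sum_j w_j * phi(y; x' beta_j, sigma2_j).
# The weights and atoms here are illustrative; in the model they are drawn from p~.
mix_density <- function(y, x, w, beta, sigma2) {
  # x: covariate vector including a leading 1 for the intercept term
  means <- apply(beta, 1, function(b) sum(x * b))   # x' beta_j for each component
  sum(w * dnorm(y, mean = means, sd = sqrt(sigma2)))
}

w      <- c(0.6, 0.4)               # mixture weights
beta   <- rbind(c(1, 2), c(1, 6))   # intercept and slope of each component
sigma2 <- c(1, 1)                   # component variances
f_y <- mix_density(y = 7, x = c(1, 3), w = w, beta = beta, sigma2 = sigma2)
```

With these values, the two component means at x = (1, 3) are 7 and 19, so the density at y = 7 is dominated by the first component.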
The mixing measure \tilde p has a Pitman-Yor process prior with strength parameter \vartheta
and discount parameter α. The location model assumes a base measure P_0 specified as
P0(d β) = N(d β; m0, S0),
while the location-scale model assumes a base measure P_0 specified as
P0(d β, d σ^2) = N(d β; m0, S0) × IG(d σ^2; a0, b0).
Optional hyperpriors complete the model specification:
m0 ~ N(m1, S0/k1), S0 ~ IW(ν1, S1), b0 ~ G(τ1, ζ1).
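As a sketch of the location-scale base measure given above, the base-R snippet below draws a pair (β, σ^2) from P0 = N(m0, S0) × IG(a0, b0); the hyperparameter values are illustrative, not package defaults:

```r
# Draw (beta, sigma2) from P0(d beta, d sigma2) = N(beta; m0, S0) x IG(sigma2; a0, b0).
rP0 <- function(m0, S0, a0, b0) {
  L <- chol(S0)                                    # upper Cholesky factor: S0 = L'L
  beta <- m0 + drop(t(L) %*% rnorm(length(m0)))    # multivariate normal draw
  sigma2 <- 1 / rgamma(1, shape = a0, rate = b0)   # inverse-gamma via reciprocal gamma
  list(beta = beta, sigma2 = sigma2)
}

set.seed(42)
draw <- rP0(m0 = c(0, 0), S0 = diag(2), a0 = 2, b0 = 2)
```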
Posterior simulation methods
This generic function implements three types of MCMC algorithms for posterior simulation.
The default method is the importance conditional sampler 'ICS'
(Canale et al. 2019). Other options are
the marginal sampler 'MAR'
(Algorithm 8 of Neal, 2000) and the slice sampler 'SLI'
(Kalli et al. 2011).
The importance conditional sampler performs an importance sampling step when updating the values of
individual parameters θ, which requires sampling m_imp
values from a suitable
proposal. Large values of m_imp
improve the mixing of the chain
at the cost of increased running time (Canale et al. 2019). When updating the individual parameters
θ, Algorithm 8 of Neal (2000) requires sampling m_marginal
values from the base
measure; m_marginal
can be chosen arbitrarily. Two options are available for the slice sampler,
namely the dependent slice-efficient sampler (slice_type = 'DEP'
), which is set as default, and the
independent slice-efficient sampler (slice_type = 'INDEP'
) (Kalli et al. 2011).
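The sampler and its tuning values are passed through the mcmc list. The sketch below pairs each sampler named above with the tuning argument this section associates with it; the list-element names are taken from this help page, but exact names and defaults should be checked against the package documentation:

```r
# Hedged sketches of mcmc lists selecting each sampler discussed above.
mcmc_ics <- list(niter = 5000, nburn = 1000, method = "ICS", m_imp = 10)
mcmc_mar <- list(niter = 5000, nburn = 1000, method = "MAR", m_marginal = 10)
mcmc_sli <- list(niter = 5000, nburn = 1000, method = "SLI", slice_type = "INDEP")
```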
A BNPdens
class object containing the estimated density and
the cluster allocations for each iteration. The output also contains the data and
the grids. If out_param = TRUE
, the output
also contains the kernel-specific parameters for each iteration. If mcmc_dens = TRUE
, the
function also returns a realization from the posterior density for each iteration.
If mean_dens = TRUE
, the output contains just the mean of the densities sampled at each iteration.
The output also returns the number of iterations,
the number of burn-in iterations, the computational time and the type of model.
Canale, A., Corradin, R., Nipoti, B. (2019), Importance conditional sampling for Bayesian nonparametric mixtures, arXiv preprint, arXiv:1906.08147.
De Iorio, M., Mueller, P., Rosner, G.L., and MacEachern, S. (2004), An ANOVA Model for Dependent Random Measures, Journal of the American Statistical Association 99, 205-215.
Kalli, M., Griffin, J. E., and Walker, S. G. (2011), Slice sampling mixture models, Statistics and Computing 21, 93-105.
Neal, R. M. (2000), Markov Chain Sampling Methods for Dirichlet Process Mixture Models, Journal of Computational and Graphical Statistics 9, 249-265.
x_toy <- c(rnorm(100, 3, 1), rnorm(100, 3, 1))
y_toy <- c(x_toy[1:100] * 2 + 1, x_toy[101:200] * 6 + 1) + rnorm(200, 0, 1)
grid_x <- c(0, 1, 2, 3, 4, 5)
grid_y <- seq(0, 35, length.out = 50)
est_model <- PYregression(y = y_toy, x = x_toy,
                          mcmc = list(niter = 200, nburn = 100),
                          output = list(grid_x = grid_x, grid_y = grid_y))
summary(est_model)
plot(est_model)