multinomialLogitMix-package {multinomialLogitMix}    R Documentation

Clustering Multinomial Count Data under the Presence of Covariates

Description

Methods for model-based clustering of multinomial counts in the presence of covariates using mixtures of multinomial logit models, as described in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. The models are estimated under both a frequentist and a Bayesian setup, using the Expectation-Maximization (EM) algorithm and Markov chain Monte Carlo (MCMC) sampling, respectively. In the frequentist case, the (unknown) number of clusters is selected according to the Integrated Completed Likelihood (ICL) criterion. In the Bayesian case, it is estimated as the number of non-empty components of an overfitting mixture model, after imposing suitable sparse prior assumptions on the mixing proportions; see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the Bayesian case, several MCMC chains also run in parallel and are allowed to switch states. The final MCMC output is post-processed to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>.
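
To make the model class concrete, the following is a minimal base R sketch of a mixture of multinomial logit models: it simulates counts given covariates and evaluates the mixture log-likelihood with a log-sum-exp step. It does not call any function of this package. All names (logit_probs, mix_loglik, beta, weights) are hypothetical, and treating the last category as the baseline is an assumption made for illustration only.

```r
# Illustrative sketch in base R; none of these names belong to the package API.
set.seed(1)

n <- 200; J <- 4; K <- 2; S <- 50    # observations, categories, clusters, trials
X <- cbind(1, rnorm(n))              # design matrix: intercept plus one covariate
p <- ncol(X)

# beta[[k]] is a p x (J - 1) coefficient matrix for cluster k.
# Assumption: the last category is the baseline, with linear predictor 0.
beta <- list(
  matrix(c( 1,  2, 0, -1, -1,  1), nrow = p),
  matrix(c(-1, -2, 1,  1,  0, -2), nrow = p)
)
weights <- c(0.6, 0.4)               # mixing proportions

# Multinomial logit link: row-wise softmax of the linear predictors.
logit_probs <- function(X, B) {
  eta <- cbind(X %*% B, 0)
  eta <- eta - apply(eta, 1, max)    # stabilise before exponentiating
  exp(eta) / rowSums(exp(eta))
}

# Simulate cluster labels, then multinomial counts given the covariates.
z <- sample(K, n, replace = TRUE, prob = weights)
y <- matrix(0L, n, J)
for (i in seq_len(n)) {
  theta <- logit_probs(X[i, , drop = FALSE], beta[[z[i]]])
  y[i, ] <- rmultinom(1, size = S, prob = as.vector(theta))
}

# Mixture log-likelihood and posterior membership probabilities.
mix_loglik <- function(y, X, beta, weights) {
  # n x K matrix: log(weight_k) + log multinomial density under cluster k
  lw <- sapply(seq_along(weights), function(k) {
    theta <- logit_probs(X, beta[[k]])
    log(weights[k]) + sapply(seq_len(nrow(y)), function(i)
      dmultinom(y[i, ], prob = theta[i, ], log = TRUE))
  })
  m <- apply(lw, 1, max)             # log-sum-exp over clusters
  w <- exp(lw - m)
  list(loglik = sum(m + log(rowSums(w))), posterior = w / rowSums(w))
}

fit <- mix_loglik(y, X, beta, weights)
fit$loglik
table(true = z, assigned = max.col(fit$posterior))
```

The posterior matrix computed here plays the role of the E-step quantities in an EM algorithm; the package's own estimation routines are considerably more elaborate than this sketch.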

Details

The DESCRIPTION file:

Package: multinomialLogitMix
Type: Package
Title: Clustering Multinomial Count Data under the Presence of Covariates
Version: 1.1
Date: 2023-07-13
Authors@R: c(person(given = "Panagiotis", family = "Papastamoulis", email = "papapast@yahoo.gr", role = c( "aut", "cre"), comment = c(ORCID = "0000-0001-9468-7613")))
Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>
Description: Methods for model-based clustering of multinomial counts under the presence of covariates using mixtures of multinomial logit models, as implemented in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. These models are estimated under a frequentist as well as a Bayesian setup using the Expectation-Maximization algorithm and Markov chain Monte Carlo sampling (MCMC), respectively. The (unknown) number of clusters is selected according to the Integrated Completed Likelihood criterion (for the frequentist model), and estimating the number of non-empty components using overfitting mixture models after imposing suitable sparse prior assumptions on the mixing proportions (in the Bayesian case), see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the latter case, various MCMC chains run in parallel and are allowed to switch states. The final MCMC output is suitably post-processed in order to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>.
License: GPL-2
Imports: Rcpp (>= 1.0.8.3), MASS, doParallel, foreach, label.switching, ggplot2, coda, matrixStats, mvtnorm, RColorBrewer
LinkingTo: Rcpp, RcppArmadillo
Author: Panagiotis Papastamoulis [aut, cre] (<https://orcid.org/0000-0001-9468-7613>)

Index of help topics:

dealWithLabelSwitching
                        Post-process the generated MCMC sample in order
                        to undo possible label switching.
expected_complete_LL    Expected complete LL
gibbs_mala_sampler      The core of the Hybrid Gibbs/MALA MCMC sampler
                        for the multinomial logit mixture.
gibbs_mala_sampler_ppt
                        Prior parallel tempering scheme of hybrid
                        Gibbs/MALA MCMC samplers for the multinomial
                        logit mixture.
log_dirichlet_pdf       Log-density function of the Dirichlet
                        distribution
mala_proposal           Proposal mechanism of the MALA step.
mixLoglikelihood_GLM    Log-likelihood of the multinomial logit.
mix_mnm_logistic        EM algorithm
multinomialLogitMix     Main function
multinomialLogitMix-package
                        Clustering Multinomial Count Data under the
                        Presence of Covariates
multinomial_logistic_EM
                        Part of the EM algorithm for multinomial logit
                        mixture
myDirichlet             Simulate from the Dirichlet distribution
newton_raphson_mstep    M-step of the EM algorithm
shakeEM_GLM             Shake-small EM
simulate_multinomial_data
                        Synthetic data generator
splitEM_GLM             Split-small EM scheme.

See the main function of the package, multinomialLogitMix, which automatically wraps calls to the MCMC sampler gibbs_mala_sampler_ppt and the EM algorithm mix_mnm_logistic.
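
As the Description notes, the two estimation routes choose the number of clusters differently. The base R sketch below illustrates both ideas on toy inputs. It is not the package's code: the function names icl and n_nonempty are hypothetical, and the ICL formula shown (BIC scale, smaller is better, with an entropy penalty) is one common convention that is assumed here rather than taken from the package.

```r
# Illustrative sketch in base R; the function names are hypothetical.

# Frequentist route: ICL on the BIC scale (smaller is better), assuming
#   ICL = -2 * logLik + n_par * log(n) + 2 * entropy of the soft assignments.
icl <- function(loglik, posterior, n_par) {
  n <- nrow(posterior)
  tau <- pmax(posterior, .Machine$double.xmin)   # avoid log(0)
  entropy <- -sum(posterior * log(tau))
  -2 * loglik + n_par * log(n) + 2 * entropy
}

# Bayesian route: count the occupied components in each MCMC draw of an
# overfitted mixture and report the most frequent count.
# `alloc` is an iterations x n matrix of component labels.
n_nonempty <- function(alloc) {
  k_per_draw <- apply(alloc, 1, function(z) length(unique(z)))
  as.integer(names(which.max(table(k_per_draw))))
}

# Toy posterior membership probabilities for three observations.
post <- rbind(c(0.9, 0.1), c(0.2, 0.8), c(1, 0))
icl(loglik = -10, posterior = post, n_par = 3)

# Toy allocations: the sampler has many components available, but only
# labels 1 and 4 are ever occupied in these draws.
set.seed(2)
alloc <- matrix(sample(c(1, 4), 50 * 20, replace = TRUE), nrow = 50)
alloc[1, ] <- 1                     # one draw with a single occupied component
n_nonempty(alloc)
```

In practice the ICL would be compared across fits with different numbers of components, while the count of non-empty components would be taken from the post-burn-in draws of a single overfitted run.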

Author(s)

Panagiotis Papastamoulis

Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>

References

Papastamoulis, P. (2023). Model based clustering of multinomial count data. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-023-00547-5

Papastamoulis, P. and Iliopoulos, G. (2010). An Artificial Allocations Based Solution to the Label Switching Problem in Bayesian Analysis of Mixtures of Distributions. Journal of Computational and Graphical Statistics, 19(2), 313-331. http://www.jstor.org/stable/25703571

Papastamoulis, P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1), 1-24. https://doi.org/10.18637/jss.v069.c01

Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 689-710. https://doi.org/10.1111/j.1467-9868.2011.00781.x

See Also

multinomialLogitMix, gibbs_mala_sampler_ppt, mix_mnm_logistic


[Package multinomialLogitMix version 1.1 Index]