multinomialLogitMix-package {multinomialLogitMix} | R Documentation |
Clustering Multinomial Count Data under the Presence of Covariates
Description
Methods for model-based clustering of multinomial counts under the presence of covariates using mixtures of multinomial logit models, as implemented in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. These models are estimated under a frequentist as well as a Bayesian setup, using the Expectation-Maximization (EM) algorithm and Markov chain Monte Carlo (MCMC) sampling, respectively. The (unknown) number of clusters is selected according to the Integrated Completed Likelihood criterion in the frequentist case. In the Bayesian case it is estimated as the number of non-empty components of an overfitting mixture model, after imposing suitable sparse prior assumptions on the mixing proportions; see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the Bayesian case, several MCMC chains run in parallel and are allowed to switch states. The final MCMC output is post-processed in order to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>.
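To make the model concrete, the following base R sketch evaluates the log-likelihood of a mixture of multinomial logit models and the posterior membership probabilities that an E-step would use. It is an illustration only and does not use the package's functions: the helper name mixture_loglik, the choice of the last category as the baseline, and the toy data are all assumptions made here.

```r
# Hypothetical helper (not part of the package).
# y:    n x J matrix of multinomial counts
# X:    n x p design matrix
# beta: list of K coefficient matrices, each p x J; the last column is
#       fixed to zero (baseline category, an assumption of this sketch)
# w:    mixing proportions (length K, summing to one)
mixture_loglik <- function(y, X, beta, w) {
  n <- nrow(y)
  K <- length(w)
  # Multinomial coefficient, identical for every component
  log_coef <- lgamma(rowSums(y) + 1) - rowSums(lgamma(y + 1))
  log_f <- matrix(0, n, K)
  for (k in seq_len(K)) {
    eta <- X %*% beta[[k]]                         # n x J linear predictors
    m <- apply(eta, 1, max)
    log_theta <- eta - (m + log(rowSums(exp(eta - m))))  # row-wise log-softmax
    log_f[, k] <- log(w[k]) + log_coef + rowSums(y * log_theta)
  }
  # Log-sum-exp over components for numerical stability
  m <- apply(log_f, 1, max)
  ll_i <- m + log(rowSums(exp(log_f - m)))
  list(loglik = sum(ll_i), posterior = exp(log_f - ll_i))
}

# Toy data, simulated only for illustration
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n))                            # intercept + one covariate
beta <- list(cbind(c(1, -1), c(0, 2), 0),
             cbind(c(-1, 1), c(1, -2), 0))
w <- c(0.4, 0.6)
z <- sample(2, n, replace = TRUE, prob = w)        # true cluster labels
y <- t(sapply(seq_len(n), function(i) {
  eta <- drop(X[i, ] %*% beta[[z[i]]])
  rmultinom(1, size = 20, prob = exp(eta))         # prob is normalised internally
}))

res <- mixture_loglik(y, X, beta, w)
res$loglik            # observed-data log-likelihood
head(res$posterior)   # membership probabilities per observation
```

Each row of the posterior matrix sums to one; assigning every observation to its most probable component yields a hard clustering.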
Details
The DESCRIPTION file:
Package: | multinomialLogitMix |
Type: | Package |
Title: | Clustering Multinomial Count Data under the Presence of Covariates |
Version: | 1.1 |
Date: | 2023-07-13 |
Authors@R: | c(person(given = "Panagiotis", family = "Papastamoulis", email = "papapast@yahoo.gr", role = c( "aut", "cre"), comment = c(ORCID = "0000-0001-9468-7613"))) |
Maintainer: | Panagiotis Papastamoulis <papapast@yahoo.gr> |
Description: | Methods for model-based clustering of multinomial counts under the presence of covariates using mixtures of multinomial logit models, as implemented in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. These models are estimated under a frequentist as well as a Bayesian setup using the Expectation-Maximization algorithm and Markov chain Monte Carlo sampling (MCMC), respectively. The (unknown) number of clusters is selected according to the Integrated Completed Likelihood criterion (for the frequentist model), and estimating the number of non-empty components using overfitting mixture models after imposing suitable sparse prior assumptions on the mixing proportions (in the Bayesian case), see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the latter case, various MCMC chains run in parallel and are allowed to switch states. The final MCMC output is suitably post-processed in order to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>. |
License: | GPL-2 |
Imports: | Rcpp (>= 1.0.8.3), MASS, doParallel, foreach, label.switching, ggplot2, coda, matrixStats, mvtnorm, RColorBrewer |
LinkingTo: | Rcpp, RcppArmadillo |
Author: | Panagiotis Papastamoulis [aut, cre] (<https://orcid.org/0000-0001-9468-7613>) |
Index of help topics:
dealWithLabelSwitching        Post-process the generated MCMC sample in order to undo possible label switching.
expected_complete_LL          Expected complete LL
gibbs_mala_sampler            The core of the Hybrid Gibbs/MALA MCMC sampler for the multinomial logit mixture.
gibbs_mala_sampler_ppt        Prior parallel tempering scheme of hybrid Gibbs/MALA MCMC samplers for the multinomial logit mixture.
log_dirichlet_pdf             Log-density function of the Dirichlet distribution
mala_proposal                 Proposal mechanism of the MALA step.
mixLoglikelihood_GLM          Log-likelihood of the multinomial logit.
mix_mnm_logistic              EM algorithm
multinomialLogitMix           Main function
multinomialLogitMix-package   Clustering Multinomial Count Data under the Presence of Covariates
multinomial_logistic_EM       Part of the EM algorithm for multinomial logit mixture
myDirichlet                   Simulate from the Dirichlet distribution
newton_raphson_mstep          M-step of the EM algorithm
shakeEM_GLM                   Shake-small EM
simulate_multinomial_data     Synthetic data generator
splitEM_GLM                   Split-small EM scheme.
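Two of the topics above concern the Dirichlet distribution, which is the usual prior for mixing proportions. The base R sketch below shows one standard way to simulate from it (normalised Gamma draws) and to evaluate its log-density. The function names rdirichlet_sketch and log_ddirichlet_sketch are hypothetical, and nothing here is claimed to match the package's own implementation or argument names.

```r
# Hypothetical helpers (not the package's myDirichlet / log_dirichlet_pdf).

# Draw one vector from Dirichlet(alpha) by normalising independent Gamma draws
rdirichlet_sketch <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Log-density of Dirichlet(alpha) at a point x on the simplex
log_ddirichlet_sketch <- function(x, alpha) {
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(x))
}

set.seed(42)
alpha <- c(2, 3, 5)
p <- rdirichlet_sketch(alpha)      # a random point on the 3-part simplex
p
log_ddirichlet_sketch(p, alpha)    # log-density at the simulated point
```

With all entries of alpha equal to one the density is flat on the simplex, which is a convenient sanity check for the log-density function.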
See the main function of the package, multinomialLogitMix, which automatically wraps calls to the MCMC sampler gibbs_mala_sampler_ppt and the EM algorithm mix_mnm_logistic.
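In the Bayesian setup summarised in the Description, the number of clusters is read off the MCMC output as the number of non-empty components. The base R sketch below illustrates that idea on a fabricated allocation matrix; it does not call the package. The helper count_nonempty, the toy draws, and the use of the most frequent value as the point estimate are assumptions made for illustration.

```r
# Hypothetical helper: number of distinct (non-empty) components per MCMC draw.
# z: matrix of allocations, one row per iteration, one column per observation
count_nonempty <- function(z) {
  apply(z, 1, function(row) length(unique(row)))
}

set.seed(7)
Kmax <- 10        # number of components in the overfitted mixture
n_iter <- 200     # number of retained MCMC iterations
n_obs <- 100      # number of observations

# Fabricated "MCMC output": only 3 of the Kmax components receive observations
z <- t(replicate(n_iter,
                 sample(Kmax, n_obs, replace = TRUE,
                        prob = c(0.5, 0.3, 0.2, rep(0, Kmax - 3)))))

k_draws <- count_nonempty(z)                          # one value per iteration
table(k_draws)                                        # distribution of non-empty counts
k_hat <- as.integer(names(which.max(table(k_draws)))) # most frequent value
k_hat
```

In a real analysis the allocation draws would come from the sampler, and label switching would still need to be undone before interpreting component-specific parameters.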
Author(s)
Panagiotis Papastamoulis
Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>
References
Papastamoulis, P. (2023). Model based clustering of multinomial count data. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-023-00547-5
Papastamoulis, P. and Iliopoulos, G. (2010). An Artificial Allocations Based Solution to the Label Switching Problem in Bayesian Analysis of Mixtures of Distributions. Journal of Computational and Graphical Statistics, 19(2), 313-331. http://www.jstor.org/stable/25703571
Papastamoulis, P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1), 1-24. https://doi.org/10.18637/jss.v069.c01
Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 689-710. https://doi.org/10.1111/j.1467-9868.2011.00781.x
See Also
multinomialLogitMix, gibbs_mala_sampler_ppt, mix_mnm_logistic