| multinomialLogitMix-package {multinomialLogitMix} | R Documentation |
Clustering Multinomial Count Data under the Presence of Covariates
Description
Methods for model-based clustering of multinomial counts in the presence of covariates using mixtures of multinomial logit models, as described in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. The models are estimated under both a frequentist and a Bayesian setup, using the Expectation-Maximization (EM) algorithm and Markov chain Monte Carlo (MCMC) sampling, respectively. In the frequentist case, the (unknown) number of clusters is selected according to the Integrated Completed Likelihood (ICL) criterion. In the Bayesian case, it is estimated as the number of non-empty components of an overfitting mixture model, after imposing suitable sparse prior assumptions on the mixing proportions; see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the Bayesian case, several MCMC chains run in parallel and are allowed to switch states. The final MCMC output is post-processed to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>.
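To make the model concrete, the following base R sketch evaluates the observed-data log-likelihood of a mixture of multinomial logit models. It does not call the package. The helper names `category_probs` and `mix_loglik`, the choice of the last category as baseline, and the storage of the coefficients as a list of (J - 1) x p matrices are assumptions made for illustration and may differ from the package's internal conventions.

```r
# Category probabilities of a multinomial logit model for one cluster.
# X: n x p design matrix; beta_k: (J - 1) x p coefficient matrix.
# ASSUMPTION: the last category is the baseline (linear predictor fixed at 0).
category_probs <- function(X, beta_k) {
  eta <- cbind(X %*% t(beta_k), 0)   # n x J linear predictors
  eta <- eta - apply(eta, 1, max)    # subtract row maxima for numerical stability
  exp(eta) / rowSums(exp(eta))
}

# Observed-data log-likelihood: sum_i log sum_k w_k * Multinomial(y_i | theta_ik).
# Y: n x J count matrix; beta: list of K coefficient matrices; weights: length K.
mix_loglik <- function(Y, X, beta, weights) {
  K <- length(weights)
  log_terms <- sapply(seq_len(K), function(k) {
    theta <- category_probs(X, beta[[k]])
    log(weights[k]) + sapply(seq_len(nrow(Y)), function(i)
      dmultinom(Y[i, ], prob = theta[i, ], log = TRUE))
  })
  log_terms <- matrix(log_terms, nrow = nrow(Y))  # keep n x K shape when n = 1
  m <- apply(log_terms, 1, max)                   # log-sum-exp over components
  sum(m + log(rowSums(exp(log_terms - m))))
}

# Small synthetic example (all values are arbitrary illustration choices).
set.seed(1)
n <- 50; J <- 3
X <- cbind(1, rnorm(n))                           # intercept plus one covariate
beta <- list(matrix(c(1, -1, 0.5, 2), nrow = J - 1),
             matrix(c(-2, 0, 1, -1), nrow = J - 1))
weights <- c(0.4, 0.6)
z <- sample(2, n, replace = TRUE, prob = weights) # latent cluster labels
Y <- t(sapply(seq_len(n), function(i)
  rmultinom(1, size = 20,
            prob = category_probs(X[i, , drop = FALSE], beta[[z[i]]]))))
ll <- mix_loglik(Y, X, beta, weights)
```

With a single component and weight 1, `mix_loglik` reduces to the ordinary multinomial logit log-likelihood, which is a convenient sanity check.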
Details
The DESCRIPTION file:
| Package: | multinomialLogitMix |
| Type: | Package |
| Title: | Clustering Multinomial Count Data under the Presence of Covariates |
| Version: | 1.1 |
| Date: | 2023-07-13 |
| Authors@R: | c(person(given = "Panagiotis", family = "Papastamoulis", email = "papapast@yahoo.gr", role = c( "aut", "cre"), comment = c(ORCID = "0000-0001-9468-7613"))) |
| Maintainer: | Panagiotis Papastamoulis <papapast@yahoo.gr> |
| Description: | Methods for model-based clustering of multinomial counts under the presence of covariates using mixtures of multinomial logit models, as implemented in Papastamoulis (2023) <DOI:10.1007/s11634-023-00547-5>. These models are estimated under a frequentist as well as a Bayesian setup using the Expectation-Maximization algorithm and Markov chain Monte Carlo sampling (MCMC), respectively. The (unknown) number of clusters is selected according to the Integrated Completed Likelihood criterion (for the frequentist model), and estimating the number of non-empty components using overfitting mixture models after imposing suitable sparse prior assumptions on the mixing proportions (in the Bayesian case), see Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>. In the latter case, various MCMC chains run in parallel and are allowed to switch states. The final MCMC output is suitably post-processed in order to undo label switching using the Equivalence Classes Representatives (ECR) algorithm, as described in Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>. |
| License: | GPL-2 |
| Imports: | Rcpp (>= 1.0.8.3), MASS, doParallel, foreach, label.switching, ggplot2, coda, matrixStats, mvtnorm, RColorBrewer |
| LinkingTo: | Rcpp, RcppArmadillo |
| Author: | Panagiotis Papastamoulis [aut, cre] (<https://orcid.org/0000-0001-9468-7613>) |
Index of help topics:
| dealWithLabelSwitching | Post-process the generated MCMC sample in order to undo possible label switching. |
| expected_complete_LL | Expected complete LL |
| gibbs_mala_sampler | The core of the Hybrid Gibbs/MALA MCMC sampler for the multinomial logit mixture. |
| gibbs_mala_sampler_ppt | Prior parallel tempering scheme of hybrid Gibbs/MALA MCMC samplers for the multinomial logit mixture. |
| log_dirichlet_pdf | Log-density function of the Dirichlet distribution |
| mala_proposal | Proposal mechanism of the MALA step. |
| mixLoglikelihood_GLM | Log-likelihood of the multinomial logit. |
| mix_mnm_logistic | EM algorithm |
| multinomialLogitMix | Main function |
| multinomialLogitMix-package | Clustering Multinomial Count Data under the Presence of Covariates |
| multinomial_logistic_EM | Part of the EM algorithm for multinomial logit mixture |
| myDirichlet | Simulate from the Dirichlet distribution |
| newton_raphson_mstep | M-step of the EM algorithm |
| shakeEM_GLM | Shake-small EM |
| simulate_multinomial_data | Synthetic data generator |
| splitEM_GLM | Split-small EM scheme. |
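Two of the help topics concern the Dirichlet distribution, which serves as the prior on the mixing proportions. The base R sketch below shows one standard way to simulate from it (normalising independent Gamma variates) and to evaluate its log-density. The functions `r_dirichlet` and `log_ddirichlet` are hypothetical stand-ins written for this illustration; they are not the package's `myDirichlet` or `log_dirichlet_pdf`, whose arguments may differ.

```r
# Draw n vectors from Dirichlet(alpha) by normalising independent Gamma variates.
r_dirichlet <- function(n, alpha) {
  # Column j of g holds n draws from Gamma(shape = alpha[j], rate = 1).
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  g / rowSums(g)
}

# Log-density of Dirichlet(alpha) at a single point x on the simplex.
log_ddirichlet <- function(x, alpha) {
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(x))
}

# Compare a sparse prior (small concentration) with a flat one:
# count how many of 5 components receive weight above 0.05, on average.
# The values 0.1, 1 and 0.05 are arbitrary illustration choices.
set.seed(42)
w_sparse <- r_dirichlet(1000, rep(0.1, 5))
w_flat   <- r_dirichlet(1000, rep(1, 5))
active_sparse <- mean(rowSums(w_sparse > 0.05))
active_flat   <- mean(rowSums(w_flat > 0.05))
```

Under the small concentration parameter, most of the mass tends to fall on a few components. This is the intuition behind using overfitting mixtures with sparse priors to estimate the number of non-empty components.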
See the main function of the package, multinomialLogitMix, which automatically wraps calls to the MCMC sampler gibbs_mala_sampler_ppt and the EM algorithm mix_mnm_logistic.
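When the EM route is used, the number of clusters is chosen by the Integrated Completed Likelihood (ICL) criterion. The sketch below shows one common form of ICL, written on the -2 log-likelihood scale so that smaller values are preferred: a BIC term plus a penalty based on the posterior membership probabilities at the maximum a posteriori allocation. The function names `icl` and `n_free_par`, the sign and scaling convention, and the parameter count (which assumes J categories with one baseline and p covariates per cluster) are assumptions made for this illustration; the package may compute and report the criterion differently.

```r
# Number of free parameters in a K-component multinomial logit mixture:
# K coefficient matrices of size (J - 1) x p, plus K - 1 free mixing weights.
n_free_par <- function(K, J, p) K * (J - 1) * p + (K - 1)

# ICL on the -2 log-likelihood scale (smaller is better).
# loglik: observed-data log-likelihood of the fitted model
# tau:    n x K matrix of posterior membership probabilities
# n_par:  number of free parameters
icl <- function(loglik, tau, n_par) {
  n <- nrow(tau)
  bic <- -2 * loglik + n_par * log(n)
  map <- max.col(tau, ties.method = "first")   # MAP cluster per observation
  # Classification penalty: zero when every allocation is certain.
  penalty <- -2 * sum(log(tau[cbind(seq_len(n), map)]))
  bic + penalty
}

# Toy comparison with made-up numbers: equal fit, different classification certainty.
tau_sharp <- rbind(c(0.99, 0.01), c(0.02, 0.98), c(0.97, 0.03))
tau_fuzzy <- rbind(c(0.60, 0.40), c(0.45, 0.55), c(0.50, 0.50))
icl_sharp <- icl(-100, tau_sharp, n_free_par(K = 2, J = 3, p = 2))
icl_fuzzy <- icl(-100, tau_fuzzy, n_free_par(K = 2, J = 3, p = 2))
```

For the same log-likelihood and parameter count, the fit with less certain allocations gets the larger (worse) value, so the criterion favours well-separated clusters.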
Author(s)
Panagiotis Papastamoulis
Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>
References
Papastamoulis, P. (2023). Model based clustering of multinomial count data. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-023-00547-5
Papastamoulis, P. and Iliopoulos, G. (2010). An Artificial Allocations Based Solution to the Label Switching Problem in Bayesian Analysis of Mixtures of Distributions. Journal of Computational and Graphical Statistics, 19(2), 313-331. http://www.jstor.org/stable/25703571
Papastamoulis, P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1), 1-24. https://doi.org/10.18637/jss.v069.c01
Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 689-710. https://doi.org/10.1111/j.1467-9868.2011.00781.x
See Also
multinomialLogitMix, gibbs_mala_sampler_ppt, mix_mnm_logistic