mcmc_IMIFA {IMIFA} | R Documentation |
Adaptive Gibbs Sampler for Nonparametric Model-based Clustering using models from the IMIFA family
Description
Carries out Gibbs sampling for all models from the IMIFA family, facilitating model-based clustering with dimensionally reduced factor-analytic covariance structures, with automatic estimation of the number of clusters and cluster-specific factors as appropriate to the method employed. Factor analysis with one group (FA/IFA), finite mixtures (MFA/MIFA), overfitted mixtures (OMFA/OMIFA), infinite factor models which employ the multiplicative gamma process (MGP) shrinkage prior (IFA/MIFA/OMIFA/IMIFA), and infinite mixtures which employ Pitman-Yor / Dirichlet Process Mixture Models (IMFA/IMIFA) are all provided.
Usage
mcmc_IMIFA(dat,
           method = c("IMIFA", "IMFA",
                      "OMIFA", "OMFA",
                      "MIFA", "MFA",
                      "IFA", "FA",
                      "classify"),
           range.G = NULL,
           range.Q = NULL,
           MGP = mgpControl(...),
           BNP = bnpControl(...),
           mixFA = mixfaControl(...),
           alpha = NULL,
           storage = storeControl(...),
           ...)
## S3 method for class 'IMIFA'
print(x, ...)
## S3 method for class 'IMIFA'
summary(object, ...)
Arguments
dat |
A matrix or data frame such that rows correspond to observations ( |
method |
An acronym for the type of model to fit where:
In principle, of course, one could overfit the |
range.G |
Depending on the method employed, either the range of values for the number of clusters, or the conservatively high starting value for the number of clusters. Defaults to (and must be!) For the If |
range.Q |
Depending on the method employed, either the range of values for the number of latent factors or, for methods ending in IFA, the conservatively high starting value for the number of cluster-specific factors, in which case the default starting value is For methods ending in IFA, different clusters can be modelled using different numbers of latent factors (incl. zero); for methods not ending in IFA it is possible to fit zero-factor models, corresponding to simple diagonal covariance structures. For instance, fitting the If See |
MGP |
A list of arguments pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models |
BNP |
A list of arguments pertaining to the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors, for use with the infinite mixture models |
mixFA |
A list of arguments pertaining to all other aspects of model fitting, e.g. MCMC settings, cluster initialisation, and hyperparameters common to every |
alpha |
Depending on the method employed, either the hyperparameter of the Dirichlet prior for the cluster mixing proportions, or the Pitman-Yor / Dirichlet process concentration parameter. Defaults to
See |
storage |
A vector of named logical indicators governing storage of parameters of interest for all models in the IMIFA family. Defaults are set by a call to |
... |
An alternative means of passing control parameters directly via the named arguments of |
x , object |
Object of class |
Details
Creates a raw object of class "IMIFA" from which the optimal/modal model can be extracted by get_IMIFA_results. Dedicated print and summary functions exist for objects of class "IMIFA".
Value
A list of lists of lists of class "IMIFA" to be passed to get_IMIFA_results. If the returned object is x, candidate models are accessible via subsetting, where x is of the following form:
x[[1:length(range.G)]][[1:length(range.Q)]].
However, these objects of class "IMIFA" should rarely, if ever, be manipulated by hand; use of the get_IMIFA_results function is strongly advised.
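The subsetting form above is best read as shorthand for "one index from the positions of range.G, then one from the positions of range.Q", since a vector inside `[[` indexes recursively in R. The following is a minimal base-R sketch of that nesting using a mock list. The object `mock` and its `G` and `Q` fields are hypothetical stand-ins for illustration only; a real "IMIFA" object holds more than this and is better handled via get_IMIFA_results.

```r
# Mock stand-in for the nested layout only (NOT real mcmc_IMIFA output).
range.G <- 3:6
range.Q <- 0:3

# Outer list: one element per value in range.G.
# Inner list: one element per value in range.Q.
mock <- lapply(range.G, function(g) {
  lapply(range.Q, function(q) list(G = g, Q = q))
})

# Candidate for the 2nd entry of range.G and the 3rd entry of range.Q.
cand <- mock[[2]][[3]]
cand$G   # the value range.G[2]
cand$Q   # the value range.Q[3]
```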
Note
Further control over the specification of advanced function arguments can be obtained with recourse to the following functions:
mgpControl
Supply arguments (with defaults) pertaining to the multiplicative gamma process (MGP) shrinkage prior and adaptive Gibbs sampler (AGS). For use with the infinite factor models "IFA", "MIFA", "OMIFA", and "IMIFA" only.
bnpControl
Supply arguments (with defaults) pertaining to the Bayesian Nonparametric Pitman-Yor / Dirichlet process priors, for use with the infinite mixture models "IMFA" and "IMIFA". Certain arguments related to the Dirichlet concentration parameter for the overfitted mixtures "OMFA" and "OMIFA" can be supplied in this manner also.
mixfaControl
Supply arguments (with defaults) pertaining to all other aspects of model fitting (e.g. MCMC settings, cluster initialisation, and hyperparameters common to every method in the IMIFA family).
storeControl
Supply logical indicators governing storage of parameters of interest for all models in the IMIFA family. It may be useful not to store certain parameters if memory is an issue (e.g. for large data sets or for a large number of MCMC iterations after burn-in and thinning).
Note however that the named arguments of these functions can also be supplied directly. Parameter starting values are obtained by simulation from the relevant prior distribution specified in these control functions, though initial means and mixing proportions are computed empirically.
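The Usage signature gives each control argument a default that is itself a call on `...` (for example, MGP = mgpControl(...)), which is what allows named arguments to be supplied directly. The following is a minimal base-R sketch of that forwarding pattern. All function and argument names in it (`toy_fit`, `toy_mcmc_control`, `toy_store_control`, `keep.scores`) are hypothetical and the default values are invented; it illustrates the mechanism only and is not IMIFA's actual code.

```r
# Hypothetical control functions: each keeps the arguments it recognises
# and silently ignores anything else through its own `...`.
toy_mcmc_control  <- function(n.iters = 1000L, ...) list(n.iters = n.iters)
toy_store_control <- function(keep.scores = TRUE, ...) list(keep.scores = keep.scores)

# Hypothetical fitting function: the defaults are evaluated lazily inside
# the function, so `...` there refers to the caller's extra named arguments.
toy_fit <- function(dat,
                    mcmc  = toy_mcmc_control(...),
                    store = toy_store_control(...),
                    ...) {
  list(mcmc = mcmc, store = store)
}

# Supplying the named arguments directly ...
direct <- toy_fit(NULL, n.iters = 500L, keep.scores = FALSE)

# ... or through the control functions gives the same settings.
explicit <- toy_fit(NULL,
                    mcmc  = toy_mcmc_control(n.iters = 500L),
                    store = toy_store_control(keep.scores = FALSE))

identical(direct, explicit)
```

In this sketch every control function must accept `...`, because each one receives all of the caller's extra arguments, not only its own.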
Author(s)
Keefe Murphy - <keefe.murphy@mu.ie>
References
Murphy, K., Viroli, C., and Gormley, I. C. (2020) Infinite mixtures of infinite factor analysers, Bayesian Analysis, 15(3): 937-963. <doi:10.1214/19-BA1179>.
Bhattacharya, A. and Dunson, D. B. (2011) Sparse Bayesian infinite factor models, Biometrika, 98(2): 291-306.
Kalli, M., Griffin, J. E. and Walker, S. G. (2011) Slice sampling mixture models, Statistics and Computing, 21(1): 93-105.
Rousseau, J. and Mengersen, K. (2011) Asymptotic behaviour of the posterior distribution in overfitted mixture models, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(5): 689-710.
McNicholas, P. D. and Murphy, T. B. (2008) Parsimonious Gaussian mixture models, Statistics and Computing, 18(3): 285-296.
See Also
get_IMIFA_results, mixfaControl, mgpControl, bnpControl, storeControl, Ledermann
Examples
# data(olive)
# data(coffee)
# Fit an IMIFA model to the olive data. Accept all defaults.
# simIMIFA <- mcmc_IMIFA(olive, method="IMIFA")
# summary(simIMIFA)
# Fit an IMIFA model assuming a Pitman-Yor prior.
# Control the balance between the DP and PY priors using the kappa parameter.
# simPY <- mcmc_IMIFA(olive, method="IMIFA", kappa=0.75)
# summary(simPY)
# Fit an MFA model to the scaled olive data, with isotropic uniquenesses (i.e. MPPCA).
# Allow diagonal covariance as a special case where range.Q = 0.
# Don't store the scores. Accept all other defaults.
# simMFA <- mcmc_IMIFA(olive, method="MFA", n.iters=10000, range.G=3:6, range.Q=0:3,
#                      score.switch=FALSE, centering=FALSE, uni.type="isotropic")
# Fit an MIFA model to the centered & scaled coffee data, with cluster labels initialised by K-Means.
# Note that range.Q doesn't need to be specified. Allow IFA as a special case where range.G=1.
# simMIFA <- mcmc_IMIFA(coffee, method="MIFA", n.iters=10000, range.G=1:3, z.init="kmeans")
# Fit an IFA model to the centered and Pareto-scaled olive data.
# Note that range.G doesn't need to be specified. We can optionally supply a range.Q starting value.
# Enforce additional shrinkage using alpha.d1, alpha.d2, prop, and eps (via mgpControl()).
# simIFA <- mcmc_IMIFA(olive, method="IFA", n.iters=10000, range.Q=4, scaling="pareto",
#                      alpha.d1=2.5, alpha.d2=4, prop=0.6, eps=0.12)
# Fit an OMIFA model to the centered & scaled coffee data.
# Supply a sufficiently small alpha value. Try varying other hyperparameters.
# Accept the default value for the starting number of factors,
# but supply a value for the starting number of clusters.
# Try constraining uniquenesses to be common across both variables and clusters.
# simOMIFA <- mcmc_IMIFA(coffee, method="OMIFA", range.G=10, psi.alpha=3,
#                        phi.hyper=c(2, 1), alpha=0.8, uni.type="single")