nClusters {fipp} | R Documentation
Prior pmf of the number of data clusters for three model types (static/dynamic MFMs and DPM)
Description
nClusters
is a closure: it returns a function that computes a table of prior
probability masses for the specified values of K+ (the number of data
clusters). To be evaluated, the returned function needs the prior
distribution of the number of mixture components K and the parameters of
that prior; for the DPM both can be left at their default of NULL (see
Examples for details).
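The two-stage calling pattern described above can be sketched in base R. The helper makeEvaluator() below is hypothetical and does not reproduce the package's computation; it only illustrates how a closure can fix one set of arguments up front and accept a prior function plus a named parameter list later.

```r
# Hypothetical helper (not part of fipp): fix the evaluation points first,
# supply the prior function and its parameters later.
makeEvaluator <- function(x) {
  function(priorK, priorKparams = list()) {
    # Call the prior at the stored points, passing the named parameters on.
    do.call(priorK, c(list(x), priorKparams))
  }
}

evaluate <- makeEvaluator(0:3)
evaluate(dpois, list(lambda = 1))   # same as dpois(0:3, lambda = 1)
evaluate(dgeom, list(prob = 0.1))   # same as dgeom(0:3, prob = 0.1)
```

Because the stored points live in the closure's environment, the same evaluator can be reused with several priors without repeating the first set of arguments.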
Usage
nClusters(
  Kplus,
  N,
  type = c("DPM", "static", "dynamic"),
  alpha = NULL,
  gamma = NULL,
  maxK = NULL,
  log = FALSE
)
Arguments
Kplus |
a numeric value or vector of positive integers (that is, 1, 2, ...). It specifies the numbers of data clusters at which the prior probabilities are evaluated. |
N |
the number of observations in the data |
type |
the type of model considered. Three models (static/dynamic MFMs and DPM) are supported. |
alpha, gamma |
hyperparameters for the symmetric Dirichlet prior. For static MFM, gamma should be specified, while alpha should be specified for all other models (that is, dynamic MFM and DPM). |
maxK |
the maximum value of K (the number of mixture components) considered. Only needed for static/dynamic MFMs. |
log |
logical, indicating whether the returned probabilities should be given on the log scale |
Value
nClusters
returns a function which takes two arguments:
- priorK
a function with support on the positive integers, which serves as the prior on K (default = NULL, which is used for the DPM).
- priorKparams
a named list of parameters for the function supplied as
priorK
(default = NULL, which is used for the DPM).
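For the DPM, where no prior on K is supplied, the prior on K+ has a standard closed form (often attributed to Antoniak): P(K+ = k | N, alpha) = |s(N, k)| alpha^k Gamma(alpha) / Gamma(alpha + N), where |s(N, k)| is an unsigned Stirling number of the first kind. The base R sketch below evaluates it on the log scale. The names logStirling1() and dpmKplus() are hypothetical, and this is not the package's implementation; it is only meant as an independent way to sanity-check values for type = "DPM".

```r
# Hypothetical helpers, not part of fipp.

# log |s(N, k)| for k = 1..N, using c(n+1, k) = n * c(n, k) + c(n, k-1).
logStirling1 <- function(N) {
  ls <- 0                                   # log c(1, 1) = log(1)
  for (n in seq_len(N - 1)) {
    a <- c(log(n) + ls, -Inf)               # n * c(n, k), with c(n, n+1) = 0
    b <- c(-Inf, ls)                        # c(n, k-1),   with c(n, 0) = 0
    m <- pmax(a, b)
    ls <- m + log(exp(a - m) + exp(b - m))  # log-sum-exp of the two terms
  }
  ls
}

# Prior pmf of K+ under a DPM with concentration parameter alpha.
dpmKplus <- function(Kplus, N, alpha, logScale = FALSE) {
  lp <- logStirling1(N)[Kplus] + Kplus * log(alpha) +
    lgamma(alpha) - lgamma(alpha + N)
  if (logScale) lp else exp(lp)
}

p <- dpmKplus(1:15, N = 100, alpha = 1)
total <- sum(dpmKplus(1:100, N = 100, alpha = 1))  # should be 1 up to rounding
```

Working on the log scale avoids overflow, since the Stirling numbers grow roughly like a factorial in N.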
References
Greve, J., Grün, B., Malsiner-Walli, G., and Frühwirth-Schnatter, S. (2020) Spying on the Prior of the Number of Data Clusters and the Partition Distribution in Bayesian Cluster Analysis. https://arxiv.org/abs/2012.12337
Escobar, M. D., and West, M. (1995) Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association 90 (430), Taylor & Francis: 577–88. https://www.tandfonline.com/doi/abs/10.1080/01621459.1995.10476550
Miller, J. W., and Harrison, M. T. (2018) Mixture Models with a Prior on the Number of Components. Journal of the American Statistical Association 113 (521), Taylor & Francis: 340–56. https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1255636
Frühwirth-Schnatter, S., Malsiner-Walli, G., and Grün, B. (2020) Generalized Mixtures of Finite Mixtures and Telescoping Sampling. https://arxiv.org/abs/2005.09918
Examples
## First, create the function pmf() for the dynamic MFM with N = 100,
## alpha = 1, and K+ evaluated from 1 to 15. Setting maxK = 30 assumes
## that K will not exceed 30; increase this value for a more realistic
## analysis.
pmf <- nClusters(Kplus = 1:15, N = 100, type = "dynamic",
                 alpha = 1, maxK = 30)
## then, specify the prior for K so that the pmf can be evaluated
## between K+ = 1 and K+ = 15
pmf(dgeom, list(prob = 0.1))
## we can also compare this result with a different prior setting
pmf(dpois, list(lambda = 1))
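The other two model types follow the same pattern. The sketch below relies only on the arguments documented above: gamma and maxK for the static MFM, alpha alone for the DPM, and the NULL defaults of priorK and priorKparams. The hyperparameter values are illustrative assumptions, and the code is skipped when fipp is not installed.

```r
if (requireNamespace("fipp", quietly = TRUE)) {
  library(fipp)
  ## static MFM: gamma is specified instead of alpha
  pmfStatic <- nClusters(Kplus = 1:15, N = 100, type = "static",
                         gamma = 1, maxK = 30)
  print(pmfStatic(dgeom, list(prob = 0.1)))
  ## DPM: no prior on K is needed, so the NULL defaults are used
  pmfDPM <- nClusters(Kplus = 1:15, N = 100, type = "DPM", alpha = 1)
  print(pmfDPM())
}
```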