G_moments {IMIFA} | R Documentation |
1st & 2nd Moments of the Pitman-Yor / Dirichlet Processes
Description
Calculate the a priori expected number of clusters (G_expected) or the variance of the number of clusters (G_variance) under a Pitman-Yor process (PYP) or Dirichlet process (DP) prior for a sample of size N at given values of the concentration parameter alpha and, optionally, the Pitman-Yor discount parameter. Useful for eliciting sensible priors (or fixed values) for alpha or discount under the "IMFA" and "IMIFA" methods for mcmc_IMIFA. Additionally, for a given sample size N and a given expected number of clusters EG, G_calibrate elicits a value for either the concentration parameter alpha or the discount parameter.
Usage
G_expected(N, alpha, discount = 0, MPFR = TRUE)
G_variance(N, alpha, discount = 0, MPFR = TRUE)
G_calibrate(N, EG, alpha = NULL, discount = 0, MPFR = TRUE, ...)
Arguments
N |
The sample size. |
alpha |
The concentration parameter. Must be specified (though not for |
discount |
The discount parameter for the Pitman-Yor process. Must be less than 1, but typically lies in the interval [0, 1). Defaults to 0 (i.e. the Dirichlet process). When |
MPFR |
Logical indicating whether the high-precision libraries |
EG |
The prior expected number of clusters. Must exceed |
... |
Additional arguments passed to |
Details
All arguments are vectorised. Users can also consult G_priorDensity in order to elicit sensible priors.
For G_calibrate, only one of alpha or discount can be supplied, and the function elicits a value for the other parameter which achieves the desired expected number of clusters EG for the given sample size N. By default, a value for alpha subject to discount=0 (i.e. the Dirichlet process) is elicited. Note that when discount is negative, alpha should be a positive integer multiple of the absolute value of discount, but the elicited alpha is not guaranteed to satisfy this. See Examples below.
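To make the calibration idea concrete, here is a minimal base-R sketch for the Dirichlet process case only (discount = 0). It is not the IMIFA implementation: the helper names dp_expected and dp_calibrate, the search interval, and the tolerance are all assumptions made for illustration. It relies on the standard DP result that the expected number of clusters is alpha * (digamma(alpha + N) - digamma(alpha)), which increases with alpha, so a one-dimensional root finder such as uniroot can invert it.

```r
# Hypothetical helpers (not part of IMIFA): DP-only calibration in base R.
dp_expected <- function(N, alpha) {
  # Standard DP result: E[G] = sum_{i=0}^{N-1} alpha / (alpha + i)
  alpha * (digamma(alpha + N) - digamma(alpha))
}

dp_calibrate <- function(N, EG, interval = c(1e-6, 1e4), tol = 1e-10) {
  # For finite alpha > 0 the expectation lies strictly between 1 and N
  stopifnot(EG > 1, EG < N)
  # Solve dp_expected(N, alpha) = EG for alpha on the assumed search interval
  uniroot(function(a) dp_expected(N, a) - EG,
          interval = interval, tol = tol)$root
}

alpha_hat <- dp_calibrate(N = 50, EG = 25)
alpha_hat                       # calibrated concentration parameter
dp_expected(50, alpha_hat)      # should recover EG up to the tolerance
```

The fixed search interval is a simplification: targets very close to N would need a wider upper bound.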
Value
The expected number of clusters under the specified prior conditions (G_expected), or the variance of the number of clusters (G_variance), or the concentration parameter alpha or discount parameter achieving a particular expected number of clusters (G_calibrate).
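For intuition about the first two of these quantities, the Dirichlet process case (discount = 0) has simple closed forms that can be checked in base R. The sketch below is illustrative only and is not taken from the package source; dp_expected and dp_variance are hypothetical names. It uses the standard representation of the number of clusters as a sum of independent Bernoulli indicators with success probabilities alpha / (alpha + i), for i = 0, ..., N - 1.

```r
# Hypothetical helpers (not part of IMIFA): DP-only moments in base R.
dp_expected <- function(N, alpha) {
  # E[G] = sum_{i=0}^{N-1} alpha / (alpha + i), written via digamma
  alpha * (digamma(alpha + N) - digamma(alpha))
}

dp_variance <- function(N, alpha) {
  # Var[G] = sum_{i=0}^{N-1} alpha * i / (alpha + i)^2, via digamma and trigamma
  dp_expected(N, alpha) + alpha^2 * (trigamma(alpha + N) - trigamma(alpha))
}

# Cross-check the closed forms against the explicit sums for one setting
N <- 50
alpha <- 19.23356
i <- 0:(N - 1)
all.equal(dp_expected(N, alpha), sum(alpha / (alpha + i)))
all.equal(dp_variance(N, alpha), sum(alpha * i / (alpha + i)^2))
```

Both helpers are vectorised over N and alpha because digamma and trigamma are.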
Note
G_variance requires use of the Rmpfr and gmp libraries for non-zero discount values. G_expected requires these libraries only for the alpha=0 case. These libraries are strongly recommended (but not required) for G_calibrate when discount is non-zero, but they are required when alpha=0 is supplied. Despite the high-precision arithmetic used, the functions can still be unstable for large N and/or extreme values of alpha and/or discount. See the argument MPFR.
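As a rough illustration of where the numerical difficulty comes from, the sketch below evaluates the standard Pitman-Yor expectation in plain double precision for alpha > 0 and 0 < discount < 1. It is an assumption-laden sketch, not the package's method: py_expected_log and py_expected_rec are hypothetical names, and no Rmpfr arithmetic is used. The closed form involves a ratio of gamma functions that individually overflow double precision for moderately large arguments, so the ratio is formed on the log scale with lgamma; a simple recursion over the sample size serves as an independent check.

```r
# Hypothetical helpers (not part of IMIFA), for alpha > 0 and 0 < discount < 1.
py_expected_log <- function(N, alpha, discount) {
  # E[G] = Gamma(a + d + N) Gamma(a + 1) / (d Gamma(a + d) Gamma(a + N)) - a / d,
  # with the ratio of gamma functions evaluated on the log scale
  log_ratio <- lgamma(alpha + discount + N) + lgamma(alpha + 1) -
    lgamma(alpha + discount) - lgamma(alpha + N)
  (exp(log_ratio) - alpha) / discount
}

py_expected_rec <- function(N, alpha, discount) {
  # With n observations and G clusters, a new cluster appears with
  # probability (alpha + discount * G) / (alpha + n); take expectations
  EG <- 1
  for (n in seq_len(N - 1)) EG <- EG + (alpha + discount * EG) / (alpha + n)
  EG
}

# The two routes should agree closely for moderate N
py_expected_log(50, alpha = 12.21619, discount = 0.25)
py_expected_rec(50, alpha = 12.21619, discount = 0.25)
```

Even on the log scale, the final subtraction can lose accuracy when the discount is very small.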
Author(s)
Keefe Murphy - <keefe.murphy@mu.ie>
References
De Blasi, P., Favaro, S., Lijoi, A., Mena, R. H., Prunster, I., and Ruggiero, M. (2015) Are Gibbs-type priors the most natural generalization of the Dirichlet process?, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2): 212-229.
Yamato, H. and Shibuya, M. (2000) Moments of some statistics of Pitman sampling formula, Bulletin of Informatics and Cybernetics, 32(1): 1-10.
See Also
G_priorDensity, Rmpfr, uniroot
Examples
# Certain examples require the use of the Rmpfr library
suppressMessages(require("Rmpfr"))
G_expected(N=50, alpha=19.23356, MPFR=FALSE)
G_variance(N=50, alpha=19.23356, MPFR=FALSE)
G_expected(N=50, alpha=c(19.23356, 12.21619, 1),
discount=c(0, 0.25, 0.7300045), MPFR=FALSE)
G_variance(N=50, alpha=c(19.23356, 12.21619, 1),
discount=c(0, 0.25, 0.7300045), MPFR=c(FALSE, TRUE, TRUE))
# Examine the growth rate of the DP
DP <- sapply(c(1, 5, 10), function(i) G_expected(1:200, alpha=i, MPFR=FALSE))
matplot(DP, type="l", xlab="N", ylab="G")
# Examine the growth rate of the PYP
PY <- sapply(c(0.25, 0.5, 0.75), function(i) G_expected(1:200, alpha=1, discount=i))
matplot(PY, type="l", xlab="N", ylab="G")
# Other special cases of the PYP are also facilitated
G_expected(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448))
G_variance(N=50, alpha=c(27.1401, 0), discount=c(-27.1401/100, 0.8054448))
# Elicit values for alpha under a DP prior
G_calibrate(N=50, EG=25)
# Elicit values for alpha under a PYP prior
# G_calibrate(N=50, EG=25, discount=c(-27.1401/100, 0.25, 0.7300045))
# Elicit values for discount under a PYP prior
# G_calibrate(N=50, EG=25, alpha=c(12.21619, 1, 0), maxiter=2000)