mix_mode {BayesMultiMode} - R Documentation
Mode estimation
Description
Mode estimation in univariate mixture distributions. The fixed-point algorithm of Carreira-Perpinan (2000) is used for Gaussian mixtures. The Modal EM algorithm of Li et al. (2007) is used for other continuous mixtures. A basic algorithm is used for discrete mixtures, see Cross et al. (2024).
Usage
mix_mode(
mixture,
tol_mixp = 0,
tol_x = 1e-06,
tol_conv = 1e-08,
type = "all",
inside_range = TRUE
)
Arguments
mixture: An object of class mixture generated with mixture().
tol_mixp: Components with a mixture proportion below tol_mixp are discarded when estimating modes; default is 0.
tol_x: (for continuous mixtures) Tolerance parameter for the distance between modes; default is 1e-6. Modes closer than tol_x are merged.
tol_conv: (for continuous mixtures) Tolerance parameter for convergence of the algorithm; default is 1e-8.
type: (for discrete mixtures) Type of modes, either "all" (the default, which includes flat modes) or "unique".
inside_range: Should modes outside of the mixture's range be discarded? Default is TRUE.
Details
This function finds modes in a univariate mixture defined as:
p(.) = \sum_{k=1}^{K}\pi_k p_k(.),
where p_k is a component probability density or mass function and \pi_k its mixture proportion.
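As a standalone illustration (independent of the package), evaluating this mixture density at a point is just the weighted sum of component densities; the parameter values below are arbitrary:

```r
# Mixture weights and Gaussian component parameters (arbitrary values)
pi_k    <- c(0.5, 0.5)
mu_k    <- c(0, 5)
sigma_k <- c(1, 2)

# p(x) = sum_k pi_k p_k(x), here with Gaussian components p_k
p_mix <- function(x) sum(pi_k * dnorm(x, mu_k, sigma_k))

p_mix(0)  # dominated by the first component near x = 0
```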
Fixed-point algorithm
Following Carreira-Perpinan (2000), a mode x is found by iterating the two steps:
(i) \quad p(k|x^{(n)}) = \frac{\pi_k p_k(x^{(n)})}{p(x^{(n)})},
(ii) \quad x^{(n+1)} = f(x^{(n)}),
with
f(x) = \left(\sum_k p(k|x) \sigma_k^{-2}\right)^{-1}\sum_k p(k|x) \sigma_k^{-2} \mu_k,
until convergence, that is, until |x^{(n+1)}-x^{(n)}| < \text{tol}_\text{conv}, where \text{tol}_\text{conv} is an argument with default value 1e-8.
Following Carreira-Perpinan (2000), the algorithm is started at each component location. Modes that differ only by a small numerical amount must then be identified as one; this tolerance can be controlled with the argument tol_x.
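For a two-component Gaussian mixture, the fixed-point iteration above can be sketched in a few lines of plain R. This is a minimal illustration, not the package implementation; the helper names fp_step and find_mode_fp are made up here, and the weights are the inverse-variance weights of Carreira-Perpinan (2000):

```r
# Two-component Gaussian mixture parameters (illustrative values)
pi_k    <- c(0.5, 0.5)
mu_k    <- c(0, 5)
sigma_k <- c(1, 2)

# Responsibilities p(k|x) = pi_k p_k(x) / p(x)
resp <- function(x) {
  d <- pi_k * dnorm(x, mu_k, sigma_k)
  d / sum(d)
}

# One fixed-point step:
# f(x) = (sum_k p(k|x) sigma_k^-2)^-1 * sum_k p(k|x) sigma_k^-2 mu_k
fp_step <- function(x) {
  w <- resp(x) / sigma_k^2
  sum(w * mu_k) / sum(w)
}

# Iterate until |x^(n+1) - x^(n)| < tol_conv
find_mode_fp <- function(x0, tol_conv = 1e-8, max_iter = 1000L) {
  x <- x0
  for (i in seq_len(max_iter)) {
    x_new <- fp_step(x)
    if (abs(x_new - x) < tol_conv) break
    x <- x_new
  }
  x_new
}

# Start from each component mean, one candidate mode per component
modes <- vapply(mu_k, find_mode_fp, numeric(1))
```

Starting from each component location yields one candidate per component; candidates closer than tol_x would then be merged.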
MEM algorithm
Following Li et al. (2007), a mode x is found by iterating the two steps:
(i) \quad p(k|x^{(n)}) = \frac{\pi_k p_k(x^{(n)})}{p(x^{(n)})},
(ii) \quad x^{(n+1)} = \text{argmax}_x \sum_k p(k|x^{(n)}) \log p_k(x),
until convergence, that is, until |x^{(n+1)}-x^{(n)}| < \text{tol}_\text{conv}, where \text{tol}_\text{conv} is an argument with default value 1e-8.
The algorithm is started at each component location. Modes that differ only by a small numerical amount must then be identified as one; modes closer than tol_x are merged.
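For general continuous mixtures the M-step has no closed form; the MEM iteration can be sketched numerically with stats::optimize() for step (ii). The function mem_mode and the search interval below are assumptions for illustration, not the package's internals:

```r
# Two-component mixture with component densities given as functions
# (illustrative: two normals; any continuous pdf would work)
pi_k  <- c(0.8, 0.2)
pdf_k <- list(function(x) dnorm(x, 0, 1),
              function(x) dnorm(x, 6, 2))

mem_mode <- function(x0, tol_conv = 1e-8, max_iter = 1000L,
                     interval = c(-5, 15)) {
  x <- x0
  for (i in seq_len(max_iter)) {
    # Step (i): responsibilities p(k | x^(n))
    d <- pi_k * vapply(pdf_k, function(f) f(x), numeric(1))
    r <- d / sum(d)
    # Step (ii): maximise sum_k p(k|x^(n)) log p_k(x) over x
    Q <- function(y) sum(r * log(vapply(pdf_k, function(f) f(y), numeric(1))))
    x_new <- optimize(Q, interval = interval, maximum = TRUE, tol = 1e-10)$maximum
    if (abs(x_new - x) < tol_conv) break
    x <- x_new
  }
  x_new
}

# Start from each component location
modes <- c(mem_mode(0), mem_mode(6))
```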
Discrete method
By definition, modes must satisfy either:
p(y_{m}-1) < p(y_{m}) > p(y_{m}+1);
or
p(y_{m}-1) < p(y_{m}) = p(y_{m}+1) = \ldots = p(y_{m}+l-1) > p(y_{m}+l).
The algorithm evaluates these two conditions at each location point in the range.
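These two conditions can be checked directly by evaluating the pmf at every point of the range. Below is a minimal sketch for a two-component Poisson mixture; discrete_modes is a hypothetical helper, not the package function:

```r
# Two-component Poisson mixture (illustrative parameters)
pi_k     <- c(0.5, 0.5)
lambda_k <- c(0.1, 10)

pmf <- function(y) sum(pi_k * dpois(y, lambda_k))

# Flag y as a mode if p(y-1) < p(y) and, after any flat stretch
# p(y) = p(y+1) = ... = p(y+l-1), the pmf drops: p(y+l-1) > p(y+l)
discrete_modes <- function(y_range) {
  p <- vapply(y_range, pmf, numeric(1))
  modes <- integer(0)
  for (i in seq_along(y_range)) {
    left <- if (i == 1) -Inf else p[i - 1]
    j <- i
    while (j < length(y_range) && p[j + 1] == p[i]) j <- j + 1  # flat stretch
    right <- if (j == length(y_range)) -Inf else p[j + 1]
    if (left < p[i] && p[i] > right) modes <- c(modes, y_range[i])
  }
  modes
}

modes <- discrete_modes(0:50)
```

With these parameters the search flags a mode at 0 (driven by the first component) and one near the second component's location.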
Value
A list of class mix_mode
containing:
mode_estimates: estimates of the mixture modes.
algo: algorithm used for mode estimation.
dist: inherited from the input mixture object.
dist_type: type of mixture distribution, i.e. continuous or discrete.
pars: inherited from the input mixture object.
pdf_func: inherited from the input mixture object.
K: inherited from the input mixture object.
nb_var: inherited from the input mixture object.
References
Cross JL, Hoogerheide L, Labonne P, van Dijk HK (2024). “Bayesian mode inference for discrete distributions in economics and finance.” Economics Letters, 235, 111579. ISSN 0165-1765, doi:10.1016/j.econlet.2024.111579.
Carreira-Perpinan MA (2000).
“Mode-finding for mixtures of Gaussian distributions.”
IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1318–1323.
ISSN 1939-3539, doi:10.1109/34.888716.
Li J, Ray S, Lindsay BG (2007).
“A Nonparametric Statistical Approach to Clustering via Mode Identification.”
Journal of Machine Learning Research, 8, 1687-1723.
Examples
# Example with a normal distribution ====================================
mu = c(0,5)
sigma = c(1,2)
p = c(0.5,0.5)
params = c(eta = p, mu = mu, sigma = sigma)
mix = mixture(params, dist = "normal", range = c(-5,15))
modes = mix_mode(mix)
# summary(modes)
# plot(modes)
# Example with a skew normal =============================================
xi = c(0,6)
omega = c(1,2)
alpha = c(0,0)
p = c(0.8,0.2)
params = c(eta = p, xi = xi, omega = omega, alpha = alpha)
dist = "skew_normal"
mix = mixture(params, dist = dist, range = c(-5,15))
modes = mix_mode(mix)
# summary(modes)
# plot(modes)
# Example with an arbitrary continuous distribution ======================
xi = c(0,6)
omega = c(1,2)
alpha = c(0,0)
nu = c(3,100)
p = c(0.8,0.2)
params = c(eta = p, mu = xi, sigma = omega, xi = alpha, nu = nu)
pdf_func <- function(x, pars) {
sn::dst(x, pars["mu"], pars["sigma"], pars["xi"], pars["nu"])
}
mix = mixture(params, pdf_func = pdf_func,
dist_type = "continuous", loc = "mu", range = c(-5,15))
modes = mix_mode(mix)
# summary(modes)
# plot(modes, from = -4, to = 4)
# Example with a poisson distribution ====================================
lambda = c(0.1,10)
p = c(0.5,0.5)
params = c(eta = p, lambda = lambda)
dist = "poisson"
mix = mixture(params, range = c(0,50), dist = dist)
modes = mix_mode(mix)
# summary(modes)
# plot(modes)
# Example with an arbitrary discrete distribution =======================
mu = c(20,5)
size = c(20,0.5)
p = c(0.5,0.5)
params = c(eta = p, mu = mu, size = size)
pmf_func <- function(x, pars) {
dnbinom(x, mu = pars["mu"], size = pars["size"])
}
mix = mixture(params, range = c(0, 50),
pdf_func = pmf_func, dist_type = "discrete")
modes = mix_mode(mix)
# summary(modes)
# plot(modes)