Approximating the density function of finite mixture models used for model-based clustering.
The density function of a G-component finite mixture model can be represented as
g(y∣Ψ) = ω1 fY(y,Θ1) + ⋯ + ωG fY(y,ΘG),
where Ψ=(Θ1,⋯,ΘG)⊤ with Θg=(ωg,μg,Σg,λg)⊤, and ω1,⋯,ωG are the mixing proportions. Herein, fY(y,Θg) is the density function of the random vector Y within the g-th component. In the restricted case, fY(y,Θg) admits the representation given by
where μg∈Rd is the location vector, λg∈Rd is the skewness vector, and Σg is a positive definite symmetric dispersion matrix, for g=1,⋯,G. Further, W is a positive random variable with mixing density function fW(w∣θg), Z0∼N(0,1), and Z1∼Nd(0,Σg). We note that W, Z0, and Z1 are mutually independent. In the canonical or unrestricted case, fY(y,Θg) admits the representation given by
where Λg is the skewness matrix and the random vector Z0 follows a zero-mean normal distribution truncated to the positive hyperplane Rd, with independent marginals of unit variance. We note that in the unrestricted case Λg is a d×d diagonal matrix, whereas in the canonical case it is a d×q matrix, so that Z0 follows a zero-mean normal distribution truncated to the positive hyperplane Rq.
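For the simplest configuration described on this page, family = "constant" with a symmetric model, each component density fY(y,Θg) is a multivariate normal density, and the mixture density is the weighted sum of the component densities. The base-R sketch below evaluates that weighted sum directly. The helper names dmvnorm_base and gauss_mix_density are hypothetical and are not part of the package; the sketch illustrates the formula only and does not claim to reproduce how dmix is implemented.

```r
# Multivariate normal density at each row of Y (hypothetical helper, base R only).
dmvnorm_base <- function(Y, mu, sigma) {
  Y <- as.matrix(Y)
  sigma <- as.matrix(sigma)
  d <- ncol(Y)
  centered <- sweep(Y, 2, mu)                       # subtract mu from every row
  # Squared Mahalanobis distance of each row from mu.
  maha <- rowSums((centered %*% solve(sigma)) * centered)
  log_det <- as.numeric(determinant(sigma, logarithm = TRUE)$modulus)
  exp(-0.5 * (d * log(2 * pi) + log_det + maha))
}

# Weighted sum over components: sum_g weight[g] * N(y; mu[[g]], sigma[[g]]).
gauss_mix_density <- function(Y, weight, mu, sigma) {
  Y <- as.matrix(Y)
  dens <- numeric(nrow(Y))
  for (g in seq_along(weight)) {
    dens <- dens + weight[g] * dmvnorm_base(Y, mu[[g]], sigma[[g]])
  }
  dens
}

# Two-component example in d = 2 (values are illustrative).
Y <- rbind(c(0, 0), c(1, -1), c(3, 3))
weight <- c(0.4, 0.6)
mu <- list(c(0, 0), c(3, 3))
sigma <- list(diag(2), matrix(c(1, 0.5, 0.5, 2), 2, 2))
dens <- gauss_mix_density(Y, weight, mu, sigma)
```

In one dimension the result can be compared with a weighted sum of dnorm values, which is a quick sanity check on the helper.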
dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant",
skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, N = 3000, log = "FALSE")
Y: an n×d matrix of observations.
G: number of components.
weight: a vector of weight parameters (or mixing proportions).
model: it must be "canonical", "restricted", or "unrestricted". By default model = "restricted".
mu: a list of location vectors of the G components.
sigma: a list of dispersion matrices of the G components.
lambda: a list of skewness vectors of the G components. If model is either "canonical" or "unrestricted", then each skewness vector must be given in matrix form of appropriate size.
family: name of the mixing distribution. By default family = "constant", which corresponds to the finite mixture of multivariate normal (or skew normal) distributions. Other candidates for the family name are: "bs" (for Birnbaum-Saunders), "burriii" (for Burr type iii), "chisq" (for chi-square), "exp" (for exponential), "f" (for Fisher), "gamma" (for gamma), "gig" (for generalized inverse-Gaussian), "igamma" (for inverse-gamma), "igaussian" (for inverse-Gaussian), "lindley" (for Lindley), "loglog" (for log-logistic), "lognorm" (for log-normal), "lomax" (for Lomax), "pstable" (for positive α-stable), "ptstable" (for polynomially tilted α-stable), "rayleigh" (for Rayleigh), and "weibull" (for Weibull).
skewness: a logical statement. By default skewness = "FALSE", which means that a symmetric model is fitted to each component (cluster). If skewness = "TRUE", then a skewed model is fitted to each component.
param: names of the elements of θ, the parameter vector of the mixing distribution with density function fW(w∣θ). By default it is NULL.
theta: a list of maximum likelihood estimates of θ (the parameter vector of the mixing distribution with density function fW(w∣θ)), across the G components. By default it is NULL.
tick: a binary vector whose length depends on the type of family. The elements of tick are either 0 or 1. If an element of tick is 0, then the corresponding element of θ is not considered in the formula of fW(w∣θ) for computing the required posterior expectations. If an element of tick is 1, then the corresponding element of θ is considered in the formula of fW(w∣θ). For instance, if family = "gamma" and either its shape or rate parameter is one, then tick = c(1). By contrast, if family = "gamma" and both the shape and rate parameters are in the formula of fW(w∣θ), then tick = c(1, 1). By default tick = NULL.
N: an integer used in the Monte Carlo approximation of g(y∣Ψ). By default N = 3000.
log: if log = "TRUE", then the function returns the log of the density function. By default log = "FALSE".
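The shapes required by these arguments can be checked before calling dmix. The base-R sketch below builds inputs for a two-component model in two dimensions with family = "constant" and verifies that they agree with one another. The helper check_dmix_inputs is hypothetical (it is not part of the package), the numeric values are illustrative, and the requirement that the weights sum to one is an assumption based on their description as mixing proportions. The dmix call is left commented out, since it needs the package that provides dmix.

```r
# Hypothetical helper: checks inputs against the argument descriptions.
check_dmix_inputs <- function(Y, G, weight, mu, sigma, lambda,
                              model = "restricted") {
  d <- ncol(Y)
  stopifnot(
    model %in% c("canonical", "restricted", "unrestricted"),
    length(weight) == G, length(mu) == G,
    length(sigma) == G, length(lambda) == G,
    all(weight > 0), abs(sum(weight) - 1) < 1e-8   # assumed: proportions sum to 1
  )
  for (g in seq_len(G)) {
    stopifnot(length(mu[[g]]) == d, all(dim(sigma[[g]]) == c(d, d)))
    # Dispersion matrices must be symmetric and positive definite.
    ev <- eigen(sigma[[g]], symmetric = TRUE, only.values = TRUE)$values
    stopifnot(isSymmetric(sigma[[g]]), all(ev > 0))
    if (model == "restricted") {
      stopifnot(length(lambda[[g]]) == d)          # skewness vector in R^d
    } else {
      L <- lambda[[g]]
      stopifnot(is.matrix(L), nrow(L) == d)        # skewness matrix with d rows
      if (model == "unrestricted") {
        # d x d diagonal skewness matrix: all off-diagonal entries are zero.
        stopifnot(ncol(L) == d, all(L[row(L) != col(L)] == 0))
      }
    }
  }
  invisible(TRUE)
}

set.seed(1)
Y <- matrix(rnorm(20), ncol = 2)                   # n = 10 observations, d = 2
G <- 2
weight <- c(0.3, 0.7)
mu <- list(c(0, 0), c(2, 2))
sigma <- list(diag(2), matrix(c(1, 0.3, 0.3, 1), 2, 2))
lambda_vec <- list(c(1, -1), c(0.5, 0.5))              # for model = "restricted"
lambda_mat <- list(diag(c(1, -1)), diag(c(0.5, 0.5)))  # for model = "unrestricted"

ok_restricted <- check_dmix_inputs(Y, G, weight, mu, sigma, lambda_vec,
                                   "restricted")
ok_unrestricted <- check_dmix_inputs(Y, G, weight, mu, sigma, lambda_mat,
                                     "unrestricted")

# With the package providing dmix loaded, a call could then look like this
# (not run here):
# dmix(Y, G, weight, model = "restricted", mu = mu, sigma = sigma,
#      lambda = lambda_vec, family = "constant", skewness = "TRUE")
```

With family = "constant", the arguments param, theta, and tick keep their NULL defaults, which is why they do not appear in the sketch.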
Monte Carlo approximated values of the mixture model density function.
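To give a feel for what a Monte Carlo approximation of such a density can look like, the base-R sketch below treats a symmetric, one-dimensional component under the assumption that, conditionally on W = w, the observation is normal with mean μ and variance wσ². The density is then approximated by averaging the conditional normal density over N draws of W. The conditional form, the inverse-gamma choice for W, and the helper name mc_scale_mixture_density are assumptions made for illustration; this page does not state which scheme dmix uses internally.

```r
# Hypothetical helper: Monte Carlo density of a normal scale mixture in d = 1.
# Assumes Y | W = w ~ N(mu, w * sigma2); rW(n) draws n values of W.
mc_scale_mixture_density <- function(y, mu, sigma2, rW, N = 3000) {
  w <- rW(N)                                  # N draws of the mixing variable
  # For each y, average the conditional normal density over the draws.
  vapply(y,
         function(yi) mean(dnorm(yi, mean = mu, sd = sqrt(w * sigma2))),
         numeric(1))
}

set.seed(123)
nu <- 5
# Inverse-gamma(nu/2, nu/2) draws, obtained as reciprocals of gamma draws.
r_igamma <- function(n) 1 / rgamma(n, shape = nu / 2, rate = nu / 2)

y <- c(-2, 0, 1.5)
approx_dens <- mc_scale_mixture_density(y, mu = 0, sigma2 = 1, rW = r_igamma)
# With this mixing distribution the exact density is Student t with nu
# degrees of freedom, which gives a reference value for the approximation.
exact_dens <- dt(y, df = nu)
max_abs_err <- max(abs(approx_dens - exact_dens))
```

The approximation error of such an average shrinks roughly like 1/sqrt(N), so a larger N trades computing time for accuracy.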