dmix {mixbox}    R Documentation

Approximating the density function of finite mixture models used for model-based clustering.

Description

The density function of a G-component finite mixture model can be represented as

g(\bold{y}|\bold{\Psi})=\sum_{g=1}^{G} \omega_{g} f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g),

where \bold{\Psi} = \bigl(\bold{\Theta}_{1},\cdots, \bold{\Theta}_{G}\bigr)^{\top} with \bold{\Theta}_g=\bigl(\omega_g, {\bold{\mu}}_g, {\Sigma}_g, {\bold{\lambda}}_g\bigr)^{\top}. Here, f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g) is the density function of the random vector \bold{Y} within the g-th component. In the restricted case, a random vector \bold{Y} with density f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g) admits the stochastic representation

{\bold{Y}} \mathop=\limits^d {\bold{\mu}}_{g}+\sqrt{W}{\bold{\lambda}}_{g}\vert{Z}_0\vert + \sqrt{W}{\Sigma}_{g}^{\frac{1}{2}} {\bold{Z}}_1,

where {\bold{\mu}}_{g} \in {R}^{d} is the location vector, {\bold{\lambda}}_{g} \in {R}^{d} is the skewness vector, and \Sigma_{g} is a symmetric positive definite dispersion matrix, for g=1,\cdots,G. Further, W is a positive random variable with mixing density function f_W(w| \bold{\theta}_{g}), {Z}_0\sim N(0, 1), and {\bold{Z}}_1\sim N_{d}\bigl( {\bold{0}}, \Sigma_{g}\bigr). The variables W, Z_0, and {\bold{Z}}_1 are mutually independent. In the canonical or unrestricted case, \bold{Y} admits the stochastic representation

{\bold{Y}} \mathop=\limits^d {\bold{\mu}}_{g}+\sqrt{W}{\bold{\Lambda}}_{g} \vert\bold{Z}_0\vert + \sqrt{W}{\Sigma}_{g}^{\frac{1}{2}} {\bold{Z}}_1,

where \bold{\Lambda}_{g} is the skewness matrix and \bold{Z}_0 is a zero-mean normal random vector, truncated to the positive orthant of R^{d}, whose independent marginals have unit variance. In the unrestricted case \bold{\Lambda}_{g} is a d \times d diagonal matrix, whereas in the canonical case it is a d\times q matrix, so that \bold{Z}_0 is a zero-mean normal random vector truncated to the positive orthant of R^{q}.
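The representations above can be made concrete by simulating from a single component. The following is a minimal base-R sketch, not code from mixbox: the helper name r_component and its arguments are hypothetical, |Z_0| is generated as the absolute value of independent N(0, 1) draws, and the dispersion term is assumed to be distributed as N_d(0, W \Sigma_g), which is one reading of the last term of each representation.

```r
# Sketch only: draws from ONE component of the stochastic representations
# above. r_component and its arguments are hypothetical names, not mixbox API.
r_component <- function(n, mu, Sigma, skew, w = rep(1, n)) {
  d <- length(mu)
  skew <- as.matrix(skew)   # d x 1 (restricted) or d x q (canonical / unrestricted)
  q <- ncol(skew)
  # |Z_0|: absolute values of independent N(0, 1) draws, one column per draw
  Z0 <- matrix(abs(rnorm(q * n)), nrow = q, ncol = n)
  # Assumption: the dispersion term is N_d(0, w * Sigma).
  # chol() returns upper-triangular R with t(R) %*% R = Sigma.
  R <- chol(Sigma)
  E <- t(R) %*% matrix(rnorm(d * n), nrow = d, ncol = n)
  # sqrt(w) repeated down each column so that draw j is scaled by sqrt(w[j])
  sw <- matrix(sqrt(w), nrow = d, ncol = n, byrow = TRUE)
  Y <- mu + sw * (skew %*% Z0) + sw * E   # mu recycles down each column
  t(Y)                                    # n x d, one draw per row
}

set.seed(1)
mu1    <- c(-5, -5)
sigma1 <- matrix(c(0.4, -0.2, -0.2, 0.5), nrow = 2, ncol = 2)

# Restricted case: skewness vector in R^d, scalar |Z_0|; W = 1 throughout
y_restricted <- r_component(1000, mu1, sigma1, skew = c(5, -5))

# Canonical case: d x q skewness matrix with d = 2 and q = 3
Lambda1 <- matrix(c(5, 0, 0, -5, 1, 1), nrow = 2)
y_canonical <- r_component(1000, mu1, sigma1, skew = Lambda1)
```

With w left at its default of 1 the draws correspond to family = "constant"; passing a vector of positive draws for w gives a scale mixture instead.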

Usage

dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant",
skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, N = 3000, log = "FALSE")
 

Arguments

Y

an n\times d matrix of observations.

G

number of components.

weight

a vector of weight parameters (or mixing proportions).

model

it must be "canonical", "restricted", or "unrestricted". By default model = "restricted".

mu

a list of location vectors of G components.

sigma

a list of dispersion matrices of G components.

lambda

a list of skewness vectors of G components. If model is either "canonical" or "unrestricted", then each skewness parameter must be given as a matrix of appropriate size (d \times q for "canonical", d \times d diagonal for "unrestricted").

family

name of mixing distribution. By default family = "constant" that corresponds to the finite mixture of multivariate normal (or skew normal) distribution. Other candidates for family name are: "bs" (for Birnbaum-Saunders), "burriii" (for Burr type iii), "chisq" (for chi-square), "exp" (for exponential), "f" (for Fisher), "gamma" (for gamma), "gig" (for generalized inverse-Gaussian), "igamma" (for inverse-gamma), "igaussian" (for inverse-Gaussian), "lindley" (for Lindley), "loglog" (for log-logistic), "lognorm" (for log-normal), "lomax" (for Lomax), "pstable" (for positive \alpha-stable), "ptstable" (for polynomially tilted \alpha-stable), "rayleigh" (for Rayleigh), and "weibull" (for Weibull).

skewness

a logical statement. By default skewness = "FALSE", which means that a symmetric model is fitted to each component (cluster). If skewness = "TRUE", then a skewed model is fitted to each component.

param

name of the elements of \bold{\theta} as the parameter vector of mixing distribution with density function f_W(w| \bold{\theta}). By default it is NULL.

theta

a list of maximum likelihood estimates of \bold{\theta} (the parameter vector of the mixing distribution with density function f_W(w| \bold{\theta})), one for each of the G components. By default it is NULL.

tick

a binary vector whose length depends on the type of family. The elements of tick are either 0 or 1. If an element of tick is 0, then the corresponding element of \bold{\theta} is not considered in the formula of f_W(w|\bold{\theta}) for computing the required posterior expectations. If an element of tick is 1, then the corresponding element of \bold{\theta} is considered in the formula of f_W(w|\bold{\theta}). For instance, if family = "gamma" and either its shape or rate parameter is one, then tick = c(1). If instead family = "gamma" and both the shape and rate parameters appear in the formula of f_W(w|\bold{\theta}), then tick = c(1, 1). By default tick = NULL.

N

an integer used in the Monte Carlo approximation of g(\bold{y}|\bold{\Psi}). By default N = 3000.

log

if log = "TRUE", then the log of the density function is returned. By default log = "FALSE".
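As an illustration of how param, theta, and tick relate to one another, the sketch below builds these arguments for family = "gamma" with G = 2 and checks that their shapes agree. The element names "shape" and "rate" and the numeric values are assumptions made for illustration only (they are not estimates from data), and the consistency checks reflect one reading of the argument descriptions above rather than validation performed by the package.

```r
# Sketch only: argument objects for a gamma mixing distribution, G = 2.
# Names and values are illustrative assumptions, not package defaults.
G      <- 2
family <- "gamma"
param  <- c("shape", "rate")       # assumed names for the elements of theta
theta  <- list(c(2, 2), c(3, 3))   # one parameter vector per component
tick   <- c(1, 1)                  # both elements appear in f_W(w | theta)

# Shape checks implied by the descriptions of param, theta, and tick
args_ok <- length(theta) == G &&
  all(lengths(theta) == length(param)) &&
  length(tick) == length(param) &&
  all(tick %in% c(0, 1))
args_ok
```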

Value

Monte Carlo approximated values of mixture model density function.
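To illustrate what a Monte Carlo approximation of such a density can look like, the sketch below treats the symmetric, one-dimensional case, where the component density is the expectation over W of a normal density with variance W \sigma^2. It is a sketch under these assumptions and not necessarily the scheme used inside dmix; the helper name mc_density is hypothetical. An inverse-gamma mixing variable is used because the resulting density is a Student t, which gives an exact value to compare against.

```r
# Sketch: Monte Carlo approximation of a symmetric scale-mixture density,
# g(y) = E_W[ N(y; mu, W * sigma^2) ], in one dimension.
# mc_density is a hypothetical helper, not the mixbox implementation.
mc_density <- function(y, mu, sigma, rW, N = 3000) {
  w <- rW(N)   # N draws of the mixing variable W
  # Average the conditional normal density over the draws, for each y
  vapply(y,
         function(yi) mean(dnorm(yi, mean = mu, sd = sigma * sqrt(w))),
         numeric(1))
}

set.seed(123)
nu <- 5
# W ~ inverse-gamma(nu / 2, nu / 2) turns the mixture into a Student t
# distribution with nu degrees of freedom
r_igamma <- function(n) 1 / rgamma(n, shape = nu / 2, rate = nu / 2)

y     <- c(-1, 0, 0.5, 2)
g_hat <- mc_density(y, mu = 0, sigma = 1, rW = r_igamma, N = 3000)
exact <- dt(y, df = nu)
max(abs(g_hat - exact))   # Monte Carlo error, shrinking roughly like 1 / sqrt(N)
```

Increasing N reduces the error at the cost of more computation, which is the trade-off controlled by the N argument.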

Author(s)

Mahdi Teimouri

Examples


      Y <- c(1, 2)
      G <- 2
 weight <- rep( 0.5, 2 )
    mu1 <- rep(  -5, 2 )
    mu2 <- rep(   5, 2 )
 sigma1 <- matrix( c( 0.4, -0.20, -0.20, 0.5 ), nrow = 2, ncol = 2 )
 sigma2 <- matrix( c( 0.5,  0.20,  0.20, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- c( 5, -5 )
lambda2 <- c(-5,  5 )
     mu <- list( mu1, mu2 )
  sigma <- list( sigma1 , sigma2 )
 lambda <- list( lambda1, lambda2)
    out <- dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family =
           "constant", skewness = "TRUE", param = NULL, theta = NULL, tick =
           NULL, N = 3000)
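For family = "constant" without skewness, the mixture density in the Description reduces to a finite mixture of multivariate normal densities, which can be evaluated in closed form. The base-R sketch below does this for the location and dispersion parameters used in the example; dmvn and gauss_mix are hypothetical helpers that are not part of mixbox. It is not a check of the skewed fit requested above, since skewness = "TRUE" gives a skew-normal mixture.

```r
# Sketch only: closed-form Gaussian mixture density in base R.
# dmvn and gauss_mix are hypothetical helpers, not part of mixbox.
dmvn <- function(y, mu, Sigma) {
  d <- length(mu)
  R <- chol(Sigma)                              # t(R) %*% R = Sigma
  z <- backsolve(R, y - mu, transpose = TRUE)   # solves t(R) %*% z = y - mu
  # sum(z^2) is the quadratic form; sum(log(diag(R))) is log(det(Sigma)) / 2
  exp(-0.5 * sum(z^2) - sum(log(diag(R))) - 0.5 * d * log(2 * pi))
}

gauss_mix <- function(y, weight, mu, sigma) {
  # sum over components of weight_g * N_d(y; mu_g, Sigma_g)
  sum(mapply(function(w, m, S) w * dmvn(y, m, S), weight, mu, sigma))
}

weight <- rep(0.5, 2)
mu     <- list(rep(-5, 2), rep(5, 2))
sigma  <- list(matrix(c(0.4, -0.2, -0.2, 0.5), nrow = 2, ncol = 2),
               matrix(c(0.5,  0.2,  0.2, 0.4), nrow = 2, ncol = 2))
g_normal <- gauss_mix(c(1, 2), weight, mu, sigma)
```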


[Package mixbox version 1.2.3 Index]