ofim1 {mixbox}R Documentation

Computing observed Fisher information matrix for restricted finite mixture model.

Description

This function computes the observed Fisher information matrix for a given restricted finite mixture model. For this, we use the method of Basford et al. (1997). The density function of each G-component finite mixture model is given by

g({\bold{y}}|\Psi)=\sum_{g=1}^{G} \omega_{g} f_{\bold{Y}}({\bold{y}}, \Theta_g),

where \bold{\Psi} = \bigl(\bold{\Theta}_{1},\cdots, \bold{\Theta}_{G}\bigr)^{\top} with \bold{\Theta}_g=\bigl({\bold{\omega}}_g, {\bold{\mu}}_g, {{\Sigma}}_g, {\bold{\lambda}}_g\bigr)^{\top}. Herein, f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g) accounts for the density function of random vector \bold{Y} within g-th component that admits the representation given by

{\bold{Y}} \mathop=\limits^d {\bold{\mu}_g}+\sqrt{W}{\bold{\lambda}_g}\vert{Z}_0\vert + \sqrt{W}{\Sigma}_{g}^{\frac{1}{2}} {\bold{Z}}_1,

where {\bold{\mu}_g} \in {R}^{d} is location vector, {\bold{\lambda}_g} \in {R}^{d} is skewness vector, \Sigma_g is a positive definite symmetric dispersion matrix for g =1,\cdots,G. Further, W is a positive random variable with mixing density function f_W(w| \bold{\theta}_g), {Z}_0\sim N(0, 1) , and {\bold{Z}}_1\sim N_{d}\bigl( {\bold{0}}, \Sigma_g \bigr) . We note that W, Z_0, and {\bold{Z}}_1 are mutually independent. For approximating the observed Fisher information matrix of the finite mixture models, we use the method of Basford et al. (1997). Based on this method, using observations {\bold{y}}=({{\bold{y}}_{1},{\bold{y}}_{2},\cdots,{\bold{y}}_{n}})^{\top}, an approximation of the expected information

-E\Bigl\{ \frac{\partial^2 \log L({\bold{\Psi}}) }{\partial \bold{\Psi} \partial \bold{\Psi}^{\top} } \Bigr\},

is give by the observed information as

\sum_{i =1}^{n} \hat{{\bold{h}}}_{i} \hat{{\bold{h}}}^{\top}_{i},

where

\hat{\bold{h}}_{i} = \frac{\partial}{\partial \bold{\Psi} } \log L_{i}(\hat{\bold{\Psi} })

and \log L(\hat{\bold{\Psi} })= \sum_{i =1}^{n} \log L_{i}(\hat{\bold{\Psi} })= \sum_{i =1}^{n} \log \Bigl\{ \sum_{g=1}^{G} \widehat{{\omega}}_g f_{\bold{Y}}\bigl({\bold{y}}_{i}|\widehat{\bold{\Theta}_g}\bigr)\Bigr\}. Herein \widehat{\omega}_g and \widehat{\bold{\Theta}_g} denote the maximum likelihood estimator of \omega_g and \bold{\Theta}_g, for g=1,\cdots,G, respectively.

Usage

 ofim1(Y, G, weight, mu, sigma, lambda, family = "constant", skewness = "FALSE",
      param = NULL, theta = NULL, tick = NULL, h = 0.001, N = 3000, level = 0.05,
    PDF = NULL )

Arguments

Y

an n \times d matrix of observations.

G

number of components.

weight

a vector of weight parameters (or mixing proportions).

mu

a list of location vectors of G components.

sigma

a list of dispersion matrices of G components.

lambda

a list of skewness vectors of G components.

family

name of the mixing distribution. By default family = "constant" that corresponds to the finite mixture of multivariate normal (or skew normal) distribution. Other candidates for family name are: "bs" (for Birnbaum-Saunders), "burriii" (for Burr type iii), "chisq" (for chi-square), "exp" (for exponential), "f" (for Fisher), "gamma" (for gamma), "gig" (for generalized inverse-Gaussian), "igamma" (for inverse-gamma), "igaussian" (for inverse-Gaussian), "lindley" (for Lindley), "loglog" (for log-logistic), "lognorm" (for log-normal), "lomax" (for Lomax), "pstable" (for positive \alpha-stable), "ptstable" (for polynomially tilted \alpha-stable), "rayleigh" (for Rayleigh), and "weibull" (for Weibull).

skewness

logical statement. By default skewness = "FALSE" which means that a symmetric model is fitted to each component (cluster). If skewness = "TRUE", then a skewed model is fitted to each component.

param

name of the elements of \theta as the parameter vector of mixing distribution with density function f_W(w|{\bold{\theta)}}. By default it is NULL.

theta

a list of maximum likelihood estimator for \theta across G components. By default it is NULL.

tick

a binary vector whose length depends on type of family. The elements of tick are either 0 or 1. If element of tick is 0, then the corresponding element of \bold{\theta} is not considered in the formula of f_W(w|{\bold{\theta)}} for computing the required posterior expectations. If element of tick is 1, then the corresponding element of \bold{\theta} is considered in the formula of f_W(w|{\bold{\theta)}}. For instance, if family = "gamma" and either its shape or rate parameter is one, then tick = c(1). This is while, if family = "gamma" and both of the shape and rate parameters are in the formula of f_W(w|{\bold{\theta)}}, then tick = c(1, 1). By default tick = NULL.

h

a positive small value for computing numerical derivative of f_W(w| \bold{\theta}) with respect to \bold{\theta}, that is \partial/ \partial \theta f_W(w| \bold{\theta}). By default h = 0.001.

N

an integer number for approximating the posterior expected values within the E-step of the EM algorithm through the Monte Carlo method. By default N = 3000.

level

significance level \alpha for constructing 100(1-\alpha)\% confidence interval. By default \alpha = 0.05.

PDF

mathematical expression for mixing density function f_W(w| \bold{\theta}). By default it is NULL.

Value

A two-part list whose first part is the observed Fisher information matrix for finite mixture model.

Author(s)

Mahdi Teimouri

References

K. E. Basford, D. R. Greenway, G. J. McLachlan, and D. Peel, (1997). Standard errors of fitted means under normal mixture, Computational Statistics, 12, 1-17.

Examples


      n <- 100
      G <- 2
 weight <- rep( 0.5, 2 )
    mu1 <- rep(-5  , 2 )
    mu2 <- rep( 5  , 2 )
 sigma1 <- matrix( c(0.4, -0.20, -0.20, 0.5 ), nrow = 2, ncol = 2 )
 sigma2 <- matrix( c(0.5,  0.20,  0.20, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- c( 5, -5 )
lambda2 <- c(-5,  5 )
     mu <- list( mu1, mu2 )
 lambda <- list( lambda1, lambda2 )
  sigma <- list( sigma1 , sigma2  )
    PDF <- quote( (b/2)^(a/2)*x^(-a/2 - 1)/gamma(a/2)*exp( -b/(x*2) ) )
  param <- c( "a","b")
 theta1 <- c( 10, 12 )
 theta2 <- c( 10, 20 )
  theta <- list( theta1, theta2 )
  tick  <- c( 1, 1 )
         Y <- rmix(n, G, weight, model = "restricted", mu, sigma, lambda, family = "igamma", theta)
      out <- ofim1(Y[, 1:2], G, weight, mu, sigma, lambda, family = "igamma", skewness = "TRUE",
            param, theta, tick, h = 0.001, N = 3000, level = 0.05, PDF)


[Package mixbox version 1.2.3 Index]