select {mixR}    R Documentation

Finite Mixture Model Selection by Information Criterion

Description

This function selects the best model from a set of candidate mixture models based on the Bayesian information criterion (BIC).

Usage

select(
  x,
  ncomp,
  family = c("normal", "weibull", "gamma", "lnorm"),
  mstep.method = c("bisection", "newton"),
  init.method = c("kmeans", "hclust"),
  tol = 1e-06,
  max_iter = 500
)

Arguments

x

a numeric vector for raw data or a three-column matrix for the binned data

ncomp

a vector of positive integers specifying the numbers of components of the candidate mixture models

family

a character string specifying the family of the mixture model. It must be one of normal, weibull, gamma or lnorm. The default is normal.

mstep.method

a character string specifying the method used in the M-step of the EM algorithm when fitting weibull or gamma mixture models. It can be either bisection or newton. The default is bisection.

init.method

a character string specifying the method used to provide initial parameter values for the EM algorithm. It can be either kmeans or hclust. The default is kmeans.

tol

the tolerance for the stopping rule of the EM algorithm. The algorithm stops when the log-likelihoods of two consecutive iterations differ by less than tol. The default is 1e-6.

max_iter

the maximum number of iterations for the EM algorithm (default 500).
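
Taken together, tol and max_iter determine when the EM iterations stop. The following base R code is a minimal sketch of such a stopping rule, not the package's implementation. The function run_until_converged and the toy update and loglik functions are hypothetical, and the use of an absolute difference is an assumption.

```r
# Hypothetical helper: iterate `update` until the log-likelihood changes
# by less than `tol`, or until `max_iter` iterations have been run.
run_until_converged <- function(update, loglik, theta,
                                tol = 1e-6, max_iter = 500) {
  old <- loglik(theta)
  for (iter in seq_len(max_iter)) {
    theta <- update(theta)          # one "EM-like" update step
    new <- loglik(theta)
    if (abs(new - old) < tol) break # assumed: absolute difference
    old <- new
  }
  list(theta = theta, loglik = new, iter = iter)
}

# Toy problem (not a mixture model): each step moves theta halfway
# towards 3, where the toy "log-likelihood" is maximal.
toy_update <- function(theta) theta + 0.5 * (3 - theta)
toy_loglik <- function(theta) -(theta - 3)^2

fit <- run_until_converged(toy_update, toy_loglik, theta = 0)
```

A smaller tol makes the loop run longer before stopping, while max_iter caps the number of iterations regardless of the tolerance.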

Details

The function select fits a series of mixture models from the given family, one for each number of components in ncomp, and regards the model with the smallest BIC as the best.
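
The comparison can be sketched in base R as follows, assuming the standard definition BIC = -2 * logLik + k * log(n), where k is the number of free parameters and n the sample size. The helpers bic_value and n_par_normal are hypothetical, and the log-likelihood values are made up for illustration only. The exact computation inside the package may differ.

```r
# Standard BIC: -2 * logLik + (number of free parameters) * log(sample size)
bic_value <- function(loglik, n_par, n_obs) {
  -2 * loglik + n_par * log(n_obs)
}

# Hypothetical helper: free parameters of a g-component normal mixture,
# i.e. (g - 1) mixing proportions + g means + g variances
# (or a single shared variance when equal_var = TRUE).
n_par_normal <- function(g, equal_var = FALSE) {
  (g - 1) + g + if (equal_var) 1 else g
}

# Made-up maximized log-likelihoods for candidate models with 2, 3 and 4
# components (illustrative numbers, not output of any real fit).
ncomp  <- 2:4
loglik <- c(-2510.4, -2447.9, -2446.8)
n_obs  <- 1000

bic  <- bic_value(loglik, n_par_normal(ncomp), n_obs)
best <- ncomp[which.min(bic)]  # number of components with the smallest BIC
```

Because the penalty term grows with the number of parameters, a model with more components is preferred only if its log-likelihood improves enough to offset the extra parameters.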

Value

The function returns an object of class selectEM which contains the following items.

ncomp

the specified numbers of components of the candidate mixture models

equal.var

a logical vector indicating whether the component variances in each mixture model are constrained to be equal (normal family only)

bic

the value of BIC for each mixture model

best

an indicator of the best model

family

the family of the mixture model

See Also

plot.selectEM, bs.test, mixfit

Examples

## selecting the optimal normal mixture model by BIC
set.seed(105)
x <- rmixnormal(1000, c(0.3, 0.4, 0.3), c(-4, 0, 4), c(1, 1, 1))
hist(x, breaks = 40)
ret <- select(x, ncomp = 2:5)
## [1] "The final model: normal mixture (equal variance) with 3 components"

## (not run) selecting the optimal Weibull mixture model by BIC
## set.seed(106)
## x <- rmixweibull(1000, c(0.3, 0.4, 0.3), c(2, 5, 8), c(0.7, 0.6, 1))
## ret <- select(x, ncomp = 2:5, family = "weibull")
## [1] "The final model: weibull mixture with 3 components"

## (not run) selecting the optimal Gamma mixture model by BIC
## set.seed(107)
## x <- rmixgamma(1000, c(0.3, 0.7), c(2, 5), c(0.7, 1))
## ret <- select(x, ncomp = 2:5, family = "gamma")
## [1] "The final model: gamma mixture with 2 components"


## (not run) selecting the optimal lognormal mixture model by BIC
## set.seed(108)
## x <- rmixlnorm(1000, c(0.2, 0.3, 0.2, 0.3), c(4, 7, 9, 12), c(1, 0.5, 0.7, 1))
## ret <- select(x, ncomp = 2:6, family = "lnorm")
## [1] "The final model: lnorm mixture with 4 components"


[Package mixR version 0.2.0 Index]