norMmixMLE {norMmix}R Documentation

Maximum Likelihood Estimation for Multivariate Normal Mixtures

Description

Direct Maximum Likelihood Estimation (MLE) for multivariate normal mixture models "norMmix". Starting from a clara (package cluster) clustering plus one M-step by default, or alternatively from the default start of (package) mclust, perform direct likelihood maximization via optim().

Usage

norMmixMLE(x, k,
           model = c("EII", "VII", "EEI", "VEI", "EVI",
                     "VVI", "EEE", "VEE", "EVV", "VVV"),
           initFUN = claraInit,
           ll = c("nmm", "mvt"),
           keep.optr = TRUE, keep.data = keep.optr,
           method = "BFGS", maxit = 100, trace = 2,
           optREPORT = 10, reltol = sqrt(.Machine$double.eps),
 	   ...)

claraInit(x, k, samples = 128,
          sampsize = ssClara2kL, trace)
mclVVVinit(x, k, ...)

ssClara2kL(n, k, p)

Arguments

x

numeric [n x p] matrix

k

positive number of components

model

a character string, specifying the model (for the k covariance matrices) to be assumed.

initFUN

a function, that takes arguments x and k and returns a clustering index; a vector of length p = ncol(x), with entries in 1:k.

ll

a string specifying the method to be used for the likelihood computation; the default, "nmm" uses llnorMmix(), whereas "mvt" uses llmvtnorm() which is based on the MV normal density from package mvtnorm.

keep.optr, keep.data

logical, each indicating of the optimization result (from optim(), currently), or the data x respectively, should be saved as part of the result (function ‘value’, see also below).

method, maxit, optREPORT, reltol

arguments for tuning the optimizer optim(*, method=method, control = list(...)).

trace
in norMmixMLE():

passed to optim(*, control=..), see above.

in claraInit():

a non-negative integer indicating how much clara() calls should be traced.

...
in norMmixMLE():

passed to optim(*, control=..), see above.

in mclVVVinit():

further arguments passed to (package mclust) function hcVVV().

samples

the number of subsamples to take in clara(), package cluster, see its help.

sampsize

the sample size to take in clara(), package cluster. Here, can be a positive integer or, as by default, a function with arguments (n,k,p).

n, p

matrix dimensions nrow(x) and ncol(x).

Details

By default, initFUN=claraInit, uses clara() and one M-step from EM-algorithm to initialize parameters after that uses general optimizer optim() to calculate the MLE.

Value

norMmixMLE returns an object of class "norMmixMLE" which is a list with components

norMmix

the "norMmix" object corresponding to the specified model and the fitted (MLE) parameter vector.

optr

(if keep.optr is true:) the [r]eturn value of optimization, i.e., currently, optim().

npar

the number of free parameters, a function of (p, k, model).

n

the sample size, i.e., the number of observations or rows of x.

cond

the result of (the hidden function) parcond(..), that is the ratio of sample size over parameter count.

x

(if keep.optr is true:) the n \times p data matrix.

Examples

MW214
set.seed(105)
x <- rnorMmix(1000, MW214)

## Fitting, assuming to know the true model (k=6, "VII")
fm1  <- norMmixMLE(x, k = 6, model = "VII", initFUN=claraInit)
fm1 # {using print.norMmixMLE() method}
fm1M <- norMmixMLE(x, k = 6, model = "VII", initFUN=mclVVVinit)

## Fitting "wrong" overparametrized model: typically need more iterations:
fmW <- norMmixMLE(x, k = 7, model = "VVV", maxit = 200, initFUN=claraInit)
## default maxit=100 is often too small    ^^^^^^^^^^^


x <- rnorMmix(2^12, MW51)
fM5 <- norMmixMLE(x, k = 4) # k = 3 is sufficient
fM5
c(logLik = logLik(fM5), AIC = AIC(fM5), BIC = BIC(fM5))
plot(fM5, show.x=FALSE)
plot(fM5, lwd=3, pch.data=".")

# this takes several seconds
 fM5big <- norMmixMLE(x, model = "VVV", k = 4, maxit = 300) # k = 3 is sufficient
 summary(warnings())
 fM5big ; c(logLik = logLik(fM5big), AIC = AIC(fM5big), BIC = BIC(fM5big))
 plot(fM5big, show.x=FALSE)


[Package norMmix version 0.1-1 Index]