RobMM {RGMM}    R Documentation

RobMM

Description

Robust Mixture Model

Usage

RobMM(X, nclust=2:5, model="Gaussian", ninit=10,
               nitermax=50, niterEM=50, niterMC=50, df=3,
               epsvp=10^(-4), mc_sample_size=1000, LogLike=-Inf,
               init='genie', epsPi=10^-4, epsout=-20,scale='none',
               alpha=0.75, c=ncol(X), w=2, epsilon=10^(-8),
               criterion='BIC',methodMC="RobbinsMC", par=TRUE,
               methodMCM="Weiszfeld")

Arguments

X

A matrix giving the data.

nclust

A vector of positive integers giving the possible number of clusters.

model

The mixture model. Can be 'Gaussian' (default), 'Student' or 'Laplace'.

ninit

The number of random initializations. Default is 10.

nitermax

The number of iterations for the Weiszfeld algorithm if methodMCM='Weiszfeld'. Default is 50.

niterEM

The number of iterations for the EM algorithm.

niterMC

The number of iterations for robustly estimating the variance of each class if methodMC='FixMC' or methodMC='GradMC'. Default is 50.

df

The degrees of freedom of the Student distribution if model='Student'. Default is 3.

scale

Run the algorithm on scaled data if scale='robust'. Default is 'none'.

epsvp

The minimum value that the estimated eigenvalues of the Median Covariation Matrix can take. Default is 10^-4.

mc_sample_size

The number of samples generated by the Monte Carlo method used to estimate the variance robustly. Default is 1000.

LogLike

The initial log-likelihood to "beat". Default is -Inf.

init

The initialization method. Can be FALSE if no non-random initialization is performed, 'genie' (default) if the algorithm is initialized with the function genie of the package genieclust, or 'Mclust' if it is initialized with the function hclass of the package mclust.

epsPi

A scalar ensuring that the estimated probabilities of belonging to a class are uniformly bounded below by a positive constant. Default is 10^-4.

epsout

If the probability that an observation belongs to a class is smaller than exp(epsout), this probability is replaced by exp(epsout) when calculating the log-likelihood. If the probability is too small for every class, the observation is considered an outlier. Default is -20.

alpha

A scalar between 1/2 and 1 used in the step sequence of the Robbins-Monro method if methodMC='RobbinsMC'. Default is 0.75.

c

The constant in the step sequence if methodMC='RobbinsMC' or methodMC='GradMC'. Default is ncol(X).

w

The power for the weighted averaged Robbins-Monro algorithm if methodMC='RobbinsMC'.

epsilon

Stopping condition for the Weiszfeld algorithm. Default is 10^-8.

criterion

The criterion for selecting the number of clusters. Can be 'BIC' (default) or 'ICL'.

methodMC

The method chosen to estimate robustly the variance. Can be 'RobbinsMC', 'GradMC' or 'FixMC'.

par

Set to TRUE (default) to allow parallelization of the algorithm.

methodMCM

The method chosen for estimating the Median Covariation Matrix. Can be 'Gmedian' or 'Weiszfeld' (default).
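
To illustrate how an iteration cap such as nitermax and a stopping tolerance such as epsilon interact in a Weiszfeld-type scheme, here is a minimal base R sketch of the classical Weiszfeld iteration for the geometric median of the rows of a matrix (the setting of Vardi and Zhang, 2000). It is an illustration under assumptions, not the code run by RobMM, which targets the Median Covariation Matrix; the function name weiszfeld_median and the simulated data are hypothetical.

```r
# Hypothetical helper: Weiszfeld iteration for the geometric median of the rows of X.
weiszfeld_median <- function(X, nitermax = 50, epsilon = 1e-8) {
  m <- colMeans(X)                        # start from the coordinate-wise mean
  for (iter in seq_len(nitermax)) {
    # Euclidean distance from each row of X to the current estimate
    d <- sqrt(rowSums(sweep(X, 2, m)^2))
    d <- pmax(d, .Machine$double.eps)     # guard against division by zero
    w <- 1 / d                            # inverse-distance weights
    m_new <- colSums(X * w) / sum(w)      # weighted mean of the rows
    converged <- sqrt(sum((m_new - m)^2)) < epsilon
    m <- m_new
    if (converged) break                  # stop early once the update is tiny
  }
  m
}

# 100 standard Gaussian points in dimension 2 plus one gross outlier
set.seed(1)
X <- rbind(matrix(rnorm(200), ncol = 2), c(1000, 1000))
med <- weiszfeld_median(X)
avg <- colMeans(X)
```

In this toy data set the single outlier drags the mean avg far from the bulk of the points, while the geometric median med stays close to the origin, which is the robustness property the median-based estimators rely on.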

Value

A list with:

bestresult

A list giving all the results of the best clustering (chosen with respect to the selected criterion).

allresults

A list containing all the results.

ICL

The ICL criterion for each candidate number of classes.

BIC

The BIC criterion for each candidate number of classes.

data

The initial data.

nclust

A vector of positive integers giving the possible number of clusters.

Kopt

The number of clusters chosen by the selected criterion.

For the lists bestresult and allresults[[k]]:

centers

A matrix whose rows are the centers of the classes.

Sigma

A matrix containing the variances of all the classes.

LogLike

The final LogLikelihood.

Pi

A matrix giving, for each observation, the probability of belonging to each class.

niter

The number of iterations of the EM algorithm.

initEM

A vector giving the initialized clustering if init='Mclust' or init='genie'.

prop

A vector giving the proportion of each class.

outliers

A vector giving the detected outliers.
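
As a small illustration of how the Pi matrix can be turned into hard cluster labels, the sketch below assigns each observation to its most probable class. The matrix here is made up for the example, and the layout (one row per observation, one column per class) is an assumption; with a fitted model one would presumably start from res$bestresult$Pi instead.

```r
# Hypothetical membership probabilities: 4 observations (rows), 3 classes (columns)
Pi <- matrix(c(0.90, 0.05, 0.05,
               0.10, 0.80, 0.10,
               0.20, 0.30, 0.50,
               0.60, 0.30, 0.10),
             ncol = 3, byrow = TRUE)

# Hard assignment: index of the most probable class for each observation
labels <- apply(Pi, 1, which.max)

# Number of observations assigned to each class, including empty classes
sizes <- table(factor(labels, levels = seq_len(ncol(Pi))))
```

Using factor() with explicit levels keeps classes that receive no observation in the count table, which is convenient when comparing clusterings with different numbers of classes.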

References

Cardot, H., Cenac, P. and Zitt, P-A. (2013). Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli, 19, 18-43.

Cardot, H. and Godichon-Baggioni, A. (2017). Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis. Test, 26(3), 461-480.

Vardi, Y. and Zhang, C.-H. (2000). The multivariate L1-median and associated data depth. Proc. Natl. Acad. Sci. USA, 97(4), 1423-1426.

See Also

See also Gen_MM, RMMplot and RobVar.

Examples

## Not run: 
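library(RGMM)
# Simulate a mixture; mu has one row per class center
ech <- Gen_MM(mu = matrix(c(rep(-2, 3), rep(2, 3), rep(0, 3)), byrow = TRUE, nrow = 3))
X <- ech$X
# Fit the robust mixture model with three clusters
res <- RobMM(X, nclust = 3)
# Plot the result with the 'Two_Dim' graph
RMMplot(res, graph = c('Two_Dim'))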
 
## End(Not run)
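
A further sketch, using only arguments and return components documented on this page, shows how several candidate numbers of clusters might be compared with a Student mixture and how the selected fit can be inspected. The wrapper name fit_student_demo is hypothetical, the chosen argument values are arbitrary, and the call is guarded so that nothing runs unless the RGMM package is installed.

```r
# Hypothetical wrapper: fit a Student mixture for 2 to 4 clusters and
# collect a few documented components of the result.
fit_student_demo <- function() {
  ech <- RGMM::Gen_MM(mu = matrix(c(rep(-2, 3), rep(2, 3), rep(0, 3)),
                                  byrow = TRUE, nrow = 3))
  res <- RGMM::RobMM(ech$X, nclust = 2:4, model = "Student", df = 3,
                     criterion = "ICL", par = FALSE)
  list(Kopt     = res$Kopt,                  # number of clusters selected by ICL
       centers  = res$bestresult$centers,    # centers of the selected fit
       prop     = res$bestresult$prop,       # class proportions
       outliers = res$bestresult$outliers)   # detected outliers
}

# Run only when the package is available
if (requireNamespace("RGMM", quietly = TRUE)) {
  str(fit_student_demo())
}
```

Setting par = FALSE keeps the run sequential, which can be easier to debug than a parallel fit.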

[Package RGMM version 2.1.0 Index]