varSelEM {MBCbook}    R Documentation

A variable selection algorithm for clustering

Description

Performs clustering and variable selection simultaneously, using an EM algorithm for the feature saliency mixture model described in Law et al. (2004) <doi:10.1109/TPAMI.2004.71>.
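
For orientation, the feature saliency mixture of Law et al. (2004) can be sketched as below. The notation is an assumption made for illustration, not taken from this page: Gaussian densities are assumed for both parts, \pi_k denotes the mixing proportions (not listed among the returned values), and the density for irrelevant variables is written as shared across groups, as in the paper, whereas this page describes lambda and alpha as group quantities.

```latex
p(x_i \mid \Theta) = \sum_{k=1}^{G} \pi_k \prod_{j=1}^{p}
  \left[ \rho_j \, \mathcal{N}(x_{ij} \mid \mu_{kj}, \sigma_{kj}^2)
       + (1 - \rho_j) \, \mathcal{N}(x_{ij} \mid \lambda_j, \alpha_j) \right]
```

Here \rho_j is the saliency of variable j: values near 1 mean the variable follows the group-specific densities, values near 0 mean it is explained by the density for irrelevant variables.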

Usage

varSelEM(X,G,maxit=100,eps=1e-6)

Arguments

X

a data frame or numeric matrix containing the observations to cluster (one row per observation).

G

the expected number of groups (integer).

maxit

the maximum number of iterations (integer). The default value is 100.

eps

the convergence threshold. The default value is 1e-6.
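
This page does not say which quantity eps is compared against. A common choice in EM implementations is the absolute change in log-likelihood between two iterations; the sketch below illustrates that stopping rule only. The helper run_until_converged and the toy log-likelihood sequence are hypothetical and are not part of MBCbook.

```r
# Hypothetical helper: iterate until the log-likelihood changes by less than
# eps or maxit iterations have been run. 'step' stands in for one EM iteration
# and must return the current log-likelihood.
run_until_converged <- function(step, maxit = 100, eps = 1e-6) {
  ll_old <- -Inf
  for (it in seq_len(maxit)) {
    ll <- step(it)
    # Assumption: convergence is judged on the absolute log-likelihood change.
    if (abs(ll - ll_old) < eps) break
    ll_old <- ll
  }
  list(ll = ll, iterations = it)
}

# Toy sequence that increases towards 0, mimicking a converging log-likelihood.
out <- run_until_converged(function(it) -2^(-it))
out$iterations
```

With this rule, maxit only acts as a safeguard when the change never drops below eps.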

Value

A list is returned with the following elements:

mu

the group means for relevant variables.

sigma

the group variances for relevant variables.

lambda

the group means for irrelevant variables.

alpha

the group variances for irrelevant variables.

rho

the feature saliency.

P

the group posterior probabilities.

cls

the group memberships.

ll

the log-likelihood value.
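
As a rough illustration of how these elements might be used, the sketch below works on a hand-made list with the same element names. All values are invented, the 0.5 cut-off on rho is an arbitrary assumption, and the relation between P and cls (maximum a posteriori assignment) is assumed rather than documented here.

```r
# Hypothetical stand-in for the list returned by varSelEM(); values are made up.
res <- list(
  rho = c(V1 = 0.95, V2 = 0.10, V3 = 0.70),
  P = matrix(c(0.9, 0.1,
               0.2, 0.8,
               0.6, 0.4), ncol = 2, byrow = TRUE)
)

# Assumption: hard memberships are the most probable group in each row of P.
map_cls <- apply(res$P, 1, which.max)

# Assumption: 0.5 is an arbitrary threshold for calling a variable relevant.
relevant <- names(res$rho)[res$rho > 0.5]

# Variables ordered from most to least salient.
ranking <- sort(res$rho, decreasing = TRUE)
```

Ranking the saliencies avoids committing to a single threshold when the values are spread out.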

Author(s)

C. Bouveyron

References

Law, M. H., Figueiredo, M. A. T., and Jain, A. K., Simultaneous feature selection and clustering using mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1154–1166, 2004 <doi:10.1109/TPAMI.2004.71>.

Examples

library(MBCbook)
data(wine27)
X = scale(wine27[,1:27])
cls = wine27$Type

# Clustering and variable selection with varSelEM
res = varSelEM(X,G=3)

# Clustering table
table(cls,res$cls)
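
A clustering table can be condensed into a single agreement score. The sketch below computes cluster purity in base R on made-up labels that stand in for wine27$Type and res$cls; the vectors truth and found are hypothetical, and purity is only one of several possible summaries.

```r
# Hypothetical labels standing in for the known classes and the estimated groups.
truth <- c("A", "A", "A", "B", "B", "C", "C", "C")
found <- c(2, 2, 1, 1, 1, 3, 3, 3)

# Cross-tabulation of classes (rows) against clusters (columns).
tab <- table(truth, found)

# Purity: share of observations that belong to the majority class of their cluster.
purity <- sum(apply(tab, 2, max)) / sum(tab)
purity
```

Purity does not depend on how the cluster labels are numbered, which matters because the group labels returned by a mixture model are arbitrary.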

[Package MBCbook version 0.1.2]