varSelEM {MBCbook}

R Documentation

A variable selection algorithm for clustering

Description

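A variable selection algorithm for model-based clustering. It implements the simultaneous feature selection and clustering method of Law et al. (2004) <doi:10.1109/TPAMI.2004.71>, which estimates a saliency for each variable alongside the mixture model.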
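
For orientation, the mixture density of Law et al. (2004) can be sketched as below. The symbols are chosen to match the element names in the Value section. The mixing proportions \pi_k, the Gaussian form of both densities, and the indexing of \lambda and \alpha by variable only follow the paper; they are assumptions about this implementation, not something this page confirms.

```latex
% Sketch of the feature saliency mixture (assumed notation):
%   \pi_k                   mixing proportion of group k (symbol introduced here)
%   \mu_{kj}, \sigma_{kj}   mean and variance of variable j in group k (relevant case)
%   \lambda_j, \alpha_j     mean and variance of variable j when it is irrelevant
%   \rho_j                  saliency of variable j, the probability that it is relevant
p(x_i) = \sum_{k=1}^{G} \pi_k \prod_{j=1}^{p}
  \left[ \rho_j \, \mathcal{N}(x_{ij};\, \mu_{kj}, \sigma_{kj})
       + (1 - \rho_j) \, \mathcal{N}(x_{ij};\, \lambda_j, \alpha_j) \right]
```

A saliency \rho_j close to 1 means variable j is modelled almost entirely by the group-specific densities, while a value close to 0 means a single common density explains it and the variable carries little clustering information.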

Usage

varSelEM(X,G,maxit=100,eps=1e-6)

Arguments

`X`	a data frame or numeric matrix containing the observations to cluster, with one row per observation and one column per variable.
`G`	the expected number of groups (integer).
`maxit`	the maximum number of EM iterations (integer). The default value is 100.
`eps`	the convergence threshold of the EM algorithm. The default value is 1e-6.
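
To illustrate how `maxit` and `eps` typically interact in an EM algorithm, here is a minimal base R sketch on a toy two-component, one-dimensional Gaussian mixture. It is not the source of `varSelEM`: the function `toy_em`, its initialization, and the stopping rule (absolute change in log-likelihood below `eps`) are assumptions made for illustration, and the package may use a different criterion.

```r
# Toy EM for a two-component 1-D Gaussian mixture, illustrating maxit/eps.
# NOT the varSelEM source; the absolute log-likelihood test is an assumption.
toy_em <- function(x, maxit = 100, eps = 1e-6) {
  # Crude initialization from the data quartiles
  mu <- as.numeric(quantile(x, c(0.25, 0.75)))
  s2 <- rep(var(x), 2)
  prop <- c(0.5, 0.5)
  ll_old <- -Inf
  for (it in seq_len(maxit)) {
    # E-step: weighted component densities, log-likelihood, posteriors
    dens <- cbind(prop[1] * dnorm(x, mu[1], sqrt(s2[1])),
                  prop[2] * dnorm(x, mu[2], sqrt(s2[2])))
    ll <- sum(log(rowSums(dens)))
    P <- dens / rowSums(dens)
    # Stop once the log-likelihood changes by less than eps
    if (abs(ll - ll_old) < eps) break
    ll_old <- ll
    # M-step: update proportions, means and variances
    nk <- colSums(P)
    prop <- nk / length(x)
    mu <- colSums(P * x) / nk
    s2 <- c(sum(P[, 1] * (x - mu[1])^2),
            sum(P[, 2] * (x - mu[2])^2)) / nk
  }
  list(mu = mu, sigma = s2, P = P, cls = max.col(P), ll = ll, iterations = it)
}

set.seed(1)
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 5))
fit <- toy_em(x, maxit = 100, eps = 1e-6)
fit$iterations  # number of iterations actually used (at most maxit)
```

A smaller `eps` demands a flatter log-likelihood before stopping and therefore tends to need more iterations, while `maxit` caps the run time when convergence is slow.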

Value

A list is returned with the following elements:

`mu`	the group means for relevant variables.
`sigma`	the group variances for relevant variables.
`lambda`	the group means for irrelevant variables.
`alpha`	the group variances for irrelevant variables.
`rho`	the feature saliency of each variable, i.e. the estimated probability that the variable is relevant for the clustering.
`P`	the group posterior probabilities.
`cls`	the group memberships.
`ll`	the log-likelihood value.
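
The following base R sketch shows how elements such as `P`, `cls` and `rho` are commonly read. The values are hypothetical, not output of `varSelEM`. It is also an assumption that `P` has one row per observation and one column per group, that `cls` corresponds to the most probable group, and that `rho` holds one value per variable. The 0.5 cut-off is an arbitrary illustration, not a rule from the package.

```r
# Hypothetical posterior probabilities: 4 observations, G = 3 groups
P <- matrix(c(0.90, 0.05, 0.05,
              0.10, 0.80, 0.10,
              0.20, 0.30, 0.50,
              0.60, 0.30, 0.10), ncol = 3, byrow = TRUE)

# Maximum a posteriori rule: assign each observation to its most probable group
cls <- apply(P, 1, which.max)
cls  # 1 2 3 1

# Hypothetical saliencies for three variables
rho <- c(V1 = 0.95, V2 = 0.10, V3 = 0.60)

# Rank variables from most to least salient
sort(rho, decreasing = TRUE)

# Keep variables whose saliency exceeds an arbitrary 0.5 cut-off
selected <- names(rho)[rho > 0.5]
selected  # "V1" "V3"
```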

Author(s)

C. Bouveyron

References

Law, M. H., Figueiredo, M. A. T., and Jain, A. K., Simultaneous feature selection and clustering using mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1154–1166, 2004 <doi:10.1109/TPAMI.2004.71>.

Examples

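library(MBCbook)

# Load the wine27 data and standardize the first 27 columns
data(wine27)
X = scale(wine27[,1:27])
cls = wine27$Type

# Clustering and variable selection with varSelEM
res = varSelEM(X,G=3)

# Clustering table: known types (rows) against estimated groups (columns)
table(cls,res$cls)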
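
Because the group numbers returned by a clustering are arbitrary, a table like the one above is read by looking for the dominant class in each column rather than at the diagonal. The sketch below uses hypothetical labels standing in for `wine27$Type` and `res$cls`. The purity score it computes is one common summary and is not part of the package.

```r
# Hypothetical labels standing in for the known types and estimated groups
truth <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
found <- c(2, 2, 2, 3, 3, 1, 1, 1, 1)

# Cross-tabulation: rows are known classes, columns are estimated groups
tab <- table(truth, found)
tab

# Purity: share of observations that fall in the dominant class of their group
purity <- sum(apply(tab, 2, max)) / length(truth)
purity  # 8/9, about 0.889
```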

[Package MBCbook version 0.1.2 Index]