varSelEM {MBCbook} | R Documentation
A variable selection algorithm for clustering
Description
A variable selection algorithm for clustering which implements the method described in Law et al. (2004) <doi:10.1109/TPAMI.2004.71>.
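For orientation, the mixture density with feature saliencies in the cited paper can be sketched as below. The symbols are chosen here for illustration: $\pi_k$ denotes the mixing proportions (to avoid a clash with the `alpha` element returned by this function), $p$ is the number of variables, and $f$ and $q$ are univariate densities. Whether the implementation follows this form term by term is an assumption.

```latex
p(x_i) = \sum_{k=1}^{G} \pi_k \prod_{j=1}^{p}
  \left[ \rho_j \, f(x_{ij} \mid \theta_{kj}) + (1 - \rho_j) \, q(x_{ij} \mid \lambda_j) \right]
```

Here $\rho_j \in [0, 1]$ is the saliency of variable $j$. A value near 1 means the variable is modeled by group-specific densities $f$ and so helps separate the groups. A value near 0 means it is modeled by a density $q$ shared by all groups and carries little clustering information.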
Usage
varSelEM(X,G,maxit=100,eps=1e-6)
Arguments
X | a data frame containing the observations to cluster.
G | the expected number of groups (integer).
maxit | the maximum number of iterations (integer). The default value is 100.
eps | the convergence threshold. The default value is 1e-6.
Value
A list is returned with the following elements:
mu | the group means for relevant variables.
sigma | the group variances for relevant variables.
lambda | the group means for irrelevant variables.
alpha | the group variances for irrelevant variables.
rho | the feature saliency.
P | the group posterior probabilities.
cls | the group memberships.
ll | the log-likelihood value.
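The returned elements can be inspected with base R. The sketch below uses a small hand-made stand-in list (`res_toy`, with made-up values) in place of a real `varSelEM` result. It assumes that `rho` holds one saliency per variable and that `P` is an observations-by-groups matrix; neither layout is confirmed by this page, and the 0.5 cutoff is an arbitrary choice for illustration.

```r
# Hypothetical stand-in for a varSelEM() result; all values are made up.
res_toy <- list(
  rho = c(V1 = 0.95, V2 = 0.10, V3 = 0.60),  # assumed: one saliency per variable
  P   = rbind(c(0.8, 0.1, 0.1),              # assumed: rows = observations,
              c(0.2, 0.7, 0.1),              #          columns = groups
              c(0.1, 0.3, 0.6),
              c(0.6, 0.3, 0.1))
)

# Rank variables from most to least salient.
ranked <- sort(res_toy$rho, decreasing = TRUE)
print(ranked)

# Keep variables whose saliency exceeds an arbitrary, user-chosen cutoff.
selected <- names(res_toy$rho)[res_toy$rho > 0.5]
print(selected)

# Hard labels: the most probable group in each row of P.
labels <- max.col(res_toy$P, ties.method = "first")
print(labels)
```

With a real result, `res$rho`, `res$P`, and `res$cls` would take the place of the toy values, provided they have the layout assumed above.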
Author(s)
C. Bouveyron
References
Law, M. H., Figueiredo, M. A. T., and Jain, A. K., Simultaneous feature selection and clustering using mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1154–1166, 2004 <doi:10.1109/TPAMI.2004.71>.
Examples
data(wine27)
X = scale(wine27[,1:27])
cls = wine27$Type
# Clustering and variable selection with varSelEM
res = varSelEM(X,G=3)
# Clustering table
table(cls,res$cls)
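
Cluster numbers are arbitrary, so a good partition need not put its counts on the diagonal of the clustering table. One simple summary is purity, which credits each cluster with its most frequent known class. The sketch below uses made-up toy labels (`truth` and `cluster` are hypothetical names standing in for `cls` and `res$cls`), so it runs without the package.

```r
# Toy labels standing in for wine27$Type and res$cls (made-up values).
truth   <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
cluster <- c(2, 2, 1, 1, 1, 1, 3, 3, 3)

# Rows are known classes, columns are estimated clusters.
tab <- table(truth, cluster)
print(tab)

# Purity: sum of the largest count in each column, divided by the total count.
purity <- sum(apply(tab, 2, max)) / sum(tab)
print(purity)
```

Purity lies between 0 and 1 and tends to increase with the number of clusters, so it is best compared across partitions with the same `G`.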