CLV_kmeans {ClustVarLV}    R Documentation

K-means algorithm for the clustering of variables

Description

K-means algorithm for the clustering of variables. Directional or local groups may be defined, and each group of variables is associated with a latent component. External information collected on the observations or on the variables may also be introduced.

Usage

CLV_kmeans(
  X,
  Xu = NULL,
  Xr = NULL,
  method,
  sX = TRUE,
  sXr = FALSE,
  sXu = FALSE,
  clust,
  iter.max = 20,
  nstart = 100,
  strategy = "none",
  rho = 0.3
)

Arguments

X

The matrix of the variables to be clustered

Xu

The external variables associated with the columns of X

Xr

The external variables associated with the rows of X

method

The criterion to use in the cluster analysis.
1 or "directional": the squared covariance is used as a measure of proximity (directional groups).
2 or "local": the covariance is used as a measure of proximity (local groups).
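
The difference between the two proximity measures can be illustrated with base R alone. The sketch below uses simulated data and an arbitrary stand-in for a latent component (both are assumptions made for illustration, not part of ClustVarLV): the covariance keeps the sign of the association, whereas the squared covariance treats positively and negatively related variables alike.

```r
# Toy data: three variables; 'c' is negatively related to 'a'.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
X  <- cbind(a = x1,
            b = x1 + rnorm(n, sd = 0.5),
            c = -x1 + rnorm(n, sd = 0.5))
X  <- scale(X)               # centre and standardise, as with sX = TRUE

# Hypothetical latent component: the standardised column 'a' (illustration only).
comp <- X[, "a"]

prox_local       <- drop(cov(X, comp))   # signed proximity ("local")
prox_directional <- prox_local^2         # sign-free proximity ("directional")

round(rbind(local = prox_local, directional = prox_directional), 2)
```

With the "local" measure, variable c is far from the component because its covariance is negative; with the "directional" measure it counts as close, since only the strength of the association matters.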

sX

TRUE/FALSE: whether the columns of X are standardized (TRUE by default).
(Predefined: cX = TRUE, i.e. the columns of X are always centered.)

sXr

TRUE/FALSE: whether the columns of Xr are standardized (FALSE by default).
(Predefined: cXr = TRUE, i.e. the columns of Xr are always centered.)

sXu

TRUE/FALSE: whether the columns of Xu are standardized (FALSE by default).
(Predefined: cXu = FALSE, i.e. no centering; Xu is considered as a weight matrix.)

clust

Either a number, i.e. the size of the partition, K, or a vector of integers, i.e. the group membership of each variable in the initial partition (integers between 1 and K).
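
As a minimal sketch of the second form, the base R code below builds an initial partition that could be passed as clust. The values of p and K are hypothetical, and requiring every cluster to be non-empty is an assumption made here for safety, not a documented constraint.

```r
set.seed(123)
p <- 12   # hypothetical number of variables (columns of X)
K <- 3    # requested number of clusters

# Random initial partition in which every label 1..K is used at least once.
init <- sample(rep_len(seq_len(K), p))

# Checks mirroring the documented form: one integer in 1..K per variable.
stopifnot(length(init) == p,
          all(init %in% seq_len(K)),
          all(init == round(init)))
table(init)
```

Passing clust = K instead leaves the choice of the initial partitions to the function, which then uses nstart random initialisations.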

iter.max

Maximal number of iterations for the consolidation (20 by default)

nstart

Number of random initialisations in the case where clust is a number (100 by default)

strategy

"none" (by default), or "kplusone" (an additional cluster for the noise variables), or "sparselv" (zero loadings for the noise variables)

rho

a threshold of correlation between 0 and 1 (0.3 by default)
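
A call using the "kplusone" strategy might look like the sketch below. It only uses arguments listed under Usage; the choice of the apples_sh$senso block, of four clusters, and of 20 random starts is an assumption made for illustration. The call is guarded so that the code does nothing when ClustVarLV is not installed.

```r
has_pkg <- requireNamespace("ClustVarLV", quietly = TRUE)

if (has_pkg) {
  data(apples_sh, package = "ClustVarLV")
  # Directional groups, with an extra cluster for variables judged to be noise.
  res_k1 <- ClustVarLV::CLV_kmeans(X = apples_sh$senso, method = "directional",
                                   sX = TRUE, clust = 4, nstart = 20,
                                   strategy = "kplusone", rho = 0.3)
  str(res_k1, max.level = 1)   # inspect what was returned
}
```

Changing rho and comparing the resulting partitions is one way to see how sensitive the set of discarded variables is to the threshold.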

Details

The initialization can be random, in which case it is repeated nstart times, or it can be defined by the user through clust.

The parameter "strategy" makes it possible to choose a strategy for setting aside variables that do not fit into the pattern of any cluster.
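
The alternating scheme behind a k-means clustering of variables can be sketched in base R. The code below is an illustration for directional groups without external variables, not the ClustVarLV implementation: the function names are hypothetical, each latent component is taken to be the standardised first principal component of its cluster, and the run simply stops if a cluster would become empty.

```r
# One latent component per cluster: standardised first principal component.
latent_components <- function(X, groups, K) {
  sapply(seq_len(K), function(k) {
    Xk <- X[, groups == k, drop = FALSE]
    as.numeric(scale(svd(Xk, nu = 1, nv = 0)$u[, 1]))
  })
}

clv_kmeans_sketch <- function(X, K, iter.max = 20) {
  X <- scale(X)                               # centre and standardise columns
  p <- ncol(X)
  groups <- sample(rep_len(seq_len(K), p))    # random initial partition
  for (iter in seq_len(iter.max)) {
    comp <- latent_components(X, groups, K)   # update step
    # Allocation step: largest squared covariance with a latent component.
    new_groups <- max.col(cov(X, comp)^2, ties.method = "first")
    # Stop when the partition is stable or a cluster would become empty.
    if (all(new_groups == groups) || length(unique(new_groups)) < K) break
    groups <- new_groups
  }
  comp <- latent_components(X, groups, K)
  crit <- sum(cov(X, comp)[cbind(seq_len(p), groups)]^2)
  list(clusters = groups, criterion = crit, iterations = iter)
}

# Simulated data: two blocks of four variables, each driven by one factor.
set.seed(42)
n  <- 100
f1 <- rnorm(n)
f2 <- rnorm(n)
X  <- cbind(sapply(1:4, function(i) f1 + rnorm(n, sd = 0.3)),
            sapply(1:4, function(i) f2 + rnorm(n, sd = 0.3)))

# Several random starts; keep the run with the largest criterion.
runs <- replicate(10, clv_kmeans_sketch(X, K = 2), simplify = FALSE)
best <- runs[[which.max(sapply(runs, `[[`, "criterion"))]]
best$clusters
```

Keeping the best of several random starts plays the role that nstart plays in CLV_kmeans: a single run may end in a local optimum, so the partition with the highest criterion value is retained.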

Value

tabres

The value of the clustering criterion at convergence.
The percentage of the explained initial criterion value.
The number of iterations in the partitioning algorithm.

clusters

The group membership of the variables

comp

The latent components of the clusters

loading

If there are external variables Xr or Xu: the loadings of the external variables
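
The returned components can be inspected as sketched below, reusing the call from the Examples section. Accessing the components with $ under the names listed above is an assumption about the structure of the returned object; str() shows the actual layout. The code is guarded so that it does nothing when ClustVarLV is not installed.

```r
has_pkg <- requireNamespace("ClustVarLV", quietly = TRUE)

if (has_pkg) {
  data(apples_sh, package = "ClustVarLV")
  res <- ClustVarLV::CLV_kmeans(X = apples_sh$pref, Xr = apples_sh$senso,
                                method = "local", sX = FALSE, sXr = TRUE,
                                clust = 2, nstart = 20)
  str(res, max.level = 1)   # top-level structure of the result
  res$tabres                # criterion value, percentage explained, iterations
  res$clusters              # group membership of the variables
}
```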

References

Vigneau E., Qannari E.M. (2003). Clustering of variables around latent components. Communications in Statistics - Simulation and Computation, 32(4), 1131-1150.

Vigneau E., Chen M., Qannari E.M. (2015). ClustVarLV: An R Package for the Clustering of Variables around Latent Variables. The R Journal, 7(2), 134-148.

Vigneau E., Chen M. (2016). Dimensionality reduction by clustering of variables while setting aside atypical variables. Electronic Journal of Applied Statistical Analysis, 9(1), 134-153.

See Also

CLV, LCLV

Examples

data(apples_sh)
# local groups with external variables Xr
resclvkmYX <- CLV_kmeans(X = apples_sh$pref, Xr = apples_sh$senso, method = "local",
                         sX = FALSE, sXr = TRUE, clust = 2, nstart = 20)

[Package ClustVarLV version 2.1.1 Index]