clustatis {ClustBlock} | R Documentation |

Hierarchical clustering of quantitative Blocks followed by a partitioning algorithm (consolidation). Each cluster of blocks is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

clustatis(Data,Blocks,NameBlocks=NULL,Noise_cluster=FALSE,scale=FALSE, Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE, gpmax=min(6, length(Blocks)-2), Testonlyoneclust=TRUE, alpha=0.05, nperm=50)

`Data` |
data frame or matrix. Corresponds to all the blocks of variables merged horizontally |

`Blocks` |
numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data |

`NameBlocks` |
string vector. Name of each block. Length must be equal to the length of the Blocks vector. If NULL, the names are B1, ..., Bm. Default: NULL |

`Noise_cluster` |
logical. Should a noise cluster be computed? Default: FALSE |

`scale` |
logical. Should the data variables be scaled? Default: FALSE |

`Itermax` |
numerical. Maximum number of iterations for the partitioning algorithm. Default: 30 |

`Graph_dend` |
logical. Should the dendrogram be plotted? Default: TRUE |

`Graph_bar` |
logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE |

`printlevel` |
logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE |

`gpmax` |
numerical. What is the maximum number of clusters to consider? Default: min(6, length(Blocks)-2) |

`Testonlyoneclust` |
logical. Test if there is more than one cluster? Default: TRUE |

`alpha` |
numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05 |

`nperm` |
numerical. How many permutations are required to test if there is more than one cluster? Default: 50 |
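The constraint linking `Data` and `Blocks` can be checked before calling the function. The following is a minimal base-R sketch on made-up data; the names `block_a`, `block_b`, `block_c` and `block_cols` are illustrative only and are not part of ClustBlock.

```r
# Three hypothetical blocks measured on the same 5 objects (rows)
set.seed(1)
block_a <- matrix(rnorm(5 * 2), nrow = 5)  # 2 variables
block_b <- matrix(rnorm(5 * 3), nrow = 5)  # 3 variables
block_c <- matrix(rnorm(5 * 2), nrow = 5)  # 2 variables

# Data: blocks merged horizontally; Blocks: number of variables per block
Data   <- cbind(block_a, block_b, block_c)
Blocks <- c(ncol(block_a), ncol(block_b), ncol(block_c))

# Documented constraint: sum(Blocks) must equal the number of columns of Data
stopifnot(sum(Blocks) == ncol(Data))

# Column indices of each block, recovered from Blocks alone
ends       <- cumsum(Blocks)
starts     <- ends - Blocks + 1
block_cols <- Map(seq, starts, ends)

# Default upper bound on the number of clusters, as written in the usage line
gpmax_default <- min(6, length(Blocks) - 2)
```

With only three blocks the default `gpmax` evaluates to 1, so a larger collection of blocks is needed for the clustering to be informative.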

For each number of clusters K (K=1 to gpmax), partitionK is a list with:

group: the clustering partition of datasets after consolidation. If Noise_cluster=TRUE, some blocks could be in the noise cluster ("K+1")

rho: the threshold for the noise cluster

homogeneity: homogeneity index

rv_with_compromise: RV coefficient of each block with its cluster compromise

weights: weight associated with each block in its cluster

comp_RV: RV coefficient between the compromises associated with the various clusters

compromise: the W compromise of each cluster

coord: the coordinates of objects of each cluster

inertia: percentage of total variance explained by each axis for each cluster

rv_all_cluster: the RV coefficient between each block and each cluster compromise

criterion: the CLUSTATIS criterion error

param: parameters called in the consolidation

type: parameter passed to other functions

There is also at the end of the list:

dend: The CLUSTATIS dendrogram

cutree_k: the partition obtained by cutting the dendrogram for K clusters (before consolidation).

overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

test_one_cluster: decision and p-value of the test of whether there is more than one cluster

param: parameters called

type: parameter passed to other functions
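Several of the returned components are RV coefficients. For orientation, here is a base-R sketch of the standard RV coefficient between two blocks observed on the same objects, computed from their cross-product matrices. The helper `rv_coef` is hypothetical; it only centers the columns and is not meant to reproduce the preprocessing or the internal code of the package.

```r
# RV coefficient between two blocks that share the same rows (objects).
# rv_coef is an illustrative helper, not a ClustBlock function.
rv_coef <- function(X, Y) {
  X  <- scale(as.matrix(X), center = TRUE, scale = FALSE)  # column-center
  Y  <- scale(as.matrix(Y), center = TRUE, scale = FALSE)
  Wx <- tcrossprod(X)  # X %*% t(X), objects-by-objects
  Wy <- tcrossprod(Y)
  # trace(Wx %*% Wy) equals sum(Wx * Wy) because both matrices are symmetric
  sum(Wx * Wy) / sqrt(sum(Wx * Wx) * sum(Wy * Wy))
}

set.seed(42)
X <- matrix(rnorm(20), nrow = 10)
Y <- matrix(rnorm(30), nrow = 10)

rv_xx <- rv_coef(X, X)      # a block compared with itself
rv_xy <- rv_coef(X, Y)      # two unrelated random blocks
rv_x2 <- rv_coef(X, 2 * X)  # unchanged when a whole block is rescaled
```

The coefficient lies between 0 and 1 and equals 1 when a block is compared with itself or with a rescaled copy of itself.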

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.

Llobell, F., Vigneau, E., Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

`plot.clustatis`, `summary.clustatis`, `clustatis_kmeans`, `statis`

data(smoo)
NameBlocks=paste0("S",1:24)
cl=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks)
plot(cl, ngroups=3, Graph_dend=FALSE)
summary(cl)
#with noise cluster
cl2=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks, Noise_cluster=TRUE, Graph_dend=FALSE, Graph_bar=FALSE)
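As a possible follow-up to the example above, the sketch below reads a few components of the returned list for the three-cluster solution. It assumes that the list elements are named `partition1`, `partition2`, ... as the "partitionK" wording of the Value section suggests; that naming is an assumption here, and the code is skipped when ClustBlock is not installed.

```r
# Runs only if ClustBlock is installed; otherwise the block is skipped.
has_clustblock <- requireNamespace("ClustBlock", quietly = TRUE)

if (has_clustblock) {
  library(ClustBlock)
  data(smoo)
  cl <- clustatis(Data = smoo, Blocks = rep(2, 24),
                  NameBlocks = paste0("S", 1:24),
                  Graph_dend = FALSE, Graph_bar = FALSE)

  # Assumed element name "partition3", following the "partitionK" description
  p3 <- cl[["partition3"]]
  print(p3$group)             # cluster membership after consolidation
  print(p3$homogeneity)       # homogeneity index
  print(cl$test_one_cluster)  # decision and p-value of the one-cluster test
}
```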

[Package *ClustBlock* version 2.4.0 Index]