hclustcompro {SPARTAAS}    R Documentation

hclustcompro

Description

Compromised hierarchical bottom-up clustering method. The method uses two sources of information. The two sources are merged through a mixing parameter (α) that weights each source.

D_\alpha = \alpha D_1 + (1 - \alpha) D_2
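
As a worked illustration of the weighting, take a single pair of observations (i, j) with hypothetical dissimilarities 0.8 in the first source and 0.2 in the second, and alpha = 0.7. These numbers are made up for the example and do not come from the package data.

```latex
\begin{aligned}
D_\alpha(i,j) &= \alpha\, D_1(i,j) + (1-\alpha)\, D_2(i,j) \\
              &= 0.7 \times 0.8 + 0.3 \times 0.2 \\
              &= 0.56 + 0.06 = 0.62
\end{aligned}
```

At the extremes, alpha = 1 gives D_alpha = D1 (only the first source is used) and alpha = 0 gives D_alpha = D2.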


Usage

hclustcompro(
  D1,
  D2,
  alpha="EstimateAlphaForMe",
  k=NULL,
  title="notitle",
  method="ward.D2",
  suppl_plot=TRUE
)

Arguments

D1

First dissimilarity matrix (square matrix) or distance matrix. Can also be a contingency table (see CAdist); in that case a factorial correspondence analysis is performed and the distances are computed with the chi-square metric.

D2

Second dissimilarity matrix (square matrix), same size as D1, or distance matrix.

alpha

The mixing parameter used to generate the D_alpha matrix (in [0, 1]). Formula: D_alpha = alpha * D1 + (1 - alpha) * D2

k

The desired number of clusters.

title

The title to be displayed on the dendrogram plot.

method

The agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).

suppl_plot

Logical. Defines whether additional plots are displayed (WSS and average silhouette plots).

Details

CAH (hierarchical agglomerative clustering)
Data fusion (for the optimal value of the parameter α, see hclustcompro_select_alpha). An appropriate weight must be defined for each data source. This is the first sensitive point of the method that the user has to consider. A tool is provided to help with this decision.
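
The fusion step can be sketched in base R as follows. This is a minimal illustration of the D_alpha formula followed by a standard hclust call, not the package's internal code: the toy data, the helper name mix_dissimilarities, and the absence of any rescaling of D1 and D2 are assumptions of the sketch.

```r
# Toy data (hypothetical): two sources describing the same 6 observations
set.seed(1)
x1 <- matrix(rnorm(12), nrow = 6)
x2 <- matrix(rnorm(12), nrow = 6)

# Square dissimilarity matrices of identical size
D1 <- as.matrix(dist(x1))
D2 <- as.matrix(dist(x2))

# Hypothetical helper: convex combination of the two matrices
mix_dissimilarities <- function(D1, D2, alpha) {
  stopifnot(all(dim(D1) == dim(D2)), alpha >= 0, alpha <= 1)
  alpha * D1 + (1 - alpha) * D2
}

alpha <- 0.7
D_alpha <- mix_dissimilarities(D1, D2, alpha)

# Bottom-up clustering on the mixed matrix with the Ward criterion
tree <- hclust(as.dist(D_alpha), method = "ward.D2")
```

Varying alpha between 0 and 1 and comparing the resulting trees gives a feel for how strongly each source drives the hierarchy.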

Cut dendrogram
The division into classes and subclasses is the second crucial point. It has to be based on knowledge of the study area and on decision support tools such as the cluster silhouette or the intra-cluster variability (WSS: within-cluster sum of squares). You can use hclustcompro_subdivide to subdivide a cluster into sub-clusters.
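
To illustrate how WSS can guide the choice of k, the sketch below cuts a tree at several values of k and computes the within-cluster sum of squares directly from a dissimilarity matrix. It uses base R only; the toy data and the helper wss_from_dist are hypothetical, and the pairwise formula is exact only for Euclidean distances, so on a mixed D_alpha matrix it should be read as a heuristic. How hclustcompro itself computes WSS is not stated here.

```r
# Toy data (hypothetical): two well-separated groups of 10 points
set.seed(42)
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))
D <- as.matrix(dist(x))
tree <- hclust(as.dist(D), method = "ward.D2")

# Hypothetical helper: WSS from pairwise distances.
# For each cluster, the sum of squared distances over all ordered pairs
# divided by 2 * n equals the sum of squared deviations from the centroid
# (Euclidean case).
wss_from_dist <- function(D, cluster) {
  sum(vapply(unique(cluster), function(g) {
    idx <- which(cluster == g)
    sum(D[idx, idx, drop = FALSE]^2) / (2 * length(idx))
  }, numeric(1)))
}

# WSS for partitions into 1 to 5 clusters
ks <- 1:5
wss <- vapply(ks, function(k) wss_from_dist(D, cutree(tree, k = k)),
              numeric(1))
```

Plotting wss against ks and looking for the point where the decrease levels off is one common way to shortlist candidate values of k before checking them against domain knowledge.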

Value

The function returns a list (class: hclustcompro_cl).

D1

First dissimilarity matrix (square matrix)

D2

Second dissimilarity matrix (square matrix)

D_alpha

The matrix used in the CAH, resulting from the mixing of the two matrices (D1 and D2)

alpha

The value of the mixing parameter alpha

tree

An object of class hclust, describing the tree generated by the clustering process (see hclust)

cluster

Vector of cluster numbers for the selected partition

cutree

Plot of the cut dendrogram

call

The function call

cont

Original contingency data (if D1 is a contingency table)

Author(s)

The hclust function is based on Fortran code contributed to STATLIB by F. Murtagh.

A. COULON

L. BELLANGER

P. HUSI

Examples

library(SPARTAAS)
data(datangkor)

#network stratigraphic data (Network)
network <- datangkor$stratigraphy

#contingency table
cont <- datangkor$contingency

#obtain the dissimilarity matrices
distance <- CAdist(cont, nPC = 11)
constraint <- adjacency(network)

#You can also run hclustcompro with the dist matrix directly
hclustcompro(D1 = distance, D2 = constraint, alpha = 0.7, k = 4)

[Package SPARTAAS version 1.2.4 Index]