hclustcompro {SPARTAAS}

R Documentation

hclustcompro

Description

Hierarchical bottom-up clustering by compromise. The method uses two sources of information, which are merged using a mixing parameter (\alpha) that weights each source.

D_\alpha = \alpha D_1 + (1-\alpha) D_2
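The formula above can be illustrated with a minimal base R sketch. The two toy matrices, the helper name `mix`, and the rescaling to [0, 1] are assumptions made for illustration; they are not taken from the SPARTAAS source.

```r
# Two toy dissimilarity matrices on the same four objects (values are made up).
D1 <- as.matrix(dist(c(1, 2, 6, 7)))
D2 <- as.matrix(dist(c(1, 5, 2, 6)))

# Rescale each source to [0, 1] so that neither dominates by scale alone
# (an assumption of this sketch, not a documented step of hclustcompro).
D1 <- D1 / max(D1)
D2 <- D2 / max(D2)

# Hypothetical helper: convex combination of the two matrices.
mix <- function(D1, D2, alpha) {
  stopifnot(alpha >= 0, alpha <= 1, all(dim(D1) == dim(D2)))
  alpha * D1 + (1 - alpha) * D2
}

D_alpha <- mix(D1, D2, alpha = 0.7)

# The mixed matrix can then be passed to a standard agglomerative clustering.
tree <- hclust(as.dist(D_alpha), method = "ward.D2")
```

With alpha = 1 the mixed matrix reduces to D1, and with alpha = 0 it reduces to D2, so alpha controls how far the result leans toward the first source.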

Usage

hclustcompro(
  D1,
  D2,
  alpha="EstimateAlphaForMe",
  k=NULL,
  title="notitle",
  method="ward.D2",
  suppl_plot=TRUE
)

Arguments

`D1`	First dissimilarity matrix (square matrix) or distance matrix. It can also be a contingency table (see CAdist), in which case a correspondence analysis is performed and the resulting distances (chi-square metric) are used.
`D2`	Second dissimilarity matrix (square matrix), same size as D1, or distance matrix.
`alpha`	The mixing parameter used to generate the D_alpha matrix, in [0, 1]. Formula: D_alpha = alpha * D1 + (1-alpha) * D2
`k`	The desired number of clusters.
`title`	The title to be displayed on the dendrogram plot.
`method`	The agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
`suppl_plot`	Logical. Whether additional plots are displayed (WSS and average silhouette plots).

Details

Hierarchical agglomerative clustering (CAH)
Data fusion (for the optimal value of \alpha, see hclustcompro_select_alpha). The user must choose the weight given to each data source. This is the first sensitive point of the method, and hclustcompro_select_alpha is provided to support that choice.
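To give a feel for how the weight affects the tree, the sketch below scans a grid of \alpha values in base R. The simulated data and the balance criterion (the absolute difference between the correlations of the cophenetic distances with each source) are assumptions chosen for illustration; hclustcompro_select_alpha may proceed differently.

```r
set.seed(1)
n <- 12

# Two hypothetical sources of dissimilarity on the same n objects.
D1 <- dist(matrix(rnorm(n * 2), ncol = 2))
D2 <- dist(matrix(rnorm(n * 2), ncol = 2))

# Put both sources on a common [0, 1] scale (assumption of this sketch).
D1 <- D1 / max(D1)
D2 <- D2 / max(D2)

alphas <- seq(0, 1, by = 0.1)

# For each alpha: cluster on the mixed dissimilarity, then measure how
# unevenly the resulting tree (via cophenetic distances) reflects each source.
crit <- sapply(alphas, function(a) {
  tree <- hclust(a * D1 + (1 - a) * D2, method = "ward.D2")
  coph <- cophenetic(tree)
  abs(cor(coph, D1) - cor(coph, D2))
})

# The alpha with the smallest imbalance is one candidate compromise.
best_alpha <- alphas[which.min(crit)]
```

A small value of `crit` means the tree is about equally faithful to both sources. Any such numerical guide should be weighed against knowledge of the data.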

Cut dendrogram
The division into classes and subclasses is the second crucial point. It should be based on knowledge of the study area and on decision-support tools such as the cluster silhouette or the intra-cluster variability (WSS: within-cluster sum of squares). You can use hclustcompro_subdivide to subdivide a cluster into sub-clusters.
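As a rough illustration of the WSS tool, the base R sketch below cuts a tree at several values of k and computes the WSS of each partition from the distance matrix alone. The simulated points and the helper name `wss` are hypothetical, and the pairwise formula used is exact only for Euclidean distances.

```r
set.seed(42)

# Hypothetical data: three loose groups of points in the plane.
x <- rbind(
  matrix(rnorm(20, mean = 0), ncol = 2),
  matrix(rnorm(20, mean = 4), ncol = 2),
  matrix(rnorm(20, mean = 8), ncol = 2)
)
d <- dist(x)
tree <- hclust(d, method = "ward.D2")

# WSS of a partition from pairwise distances: for each cluster, the sum of
# squared distances over all ordered pairs divided by twice the cluster size.
wss <- function(d, cl) {
  m <- as.matrix(d)^2
  sum(sapply(unique(cl), function(g) {
    idx <- which(cl == g)
    sum(m[idx, idx]) / (2 * length(idx))
  }))
}

ks <- 1:6
wss_by_k <- sapply(ks, function(k) wss(d, cutree(tree, k = k)))

# An "elbow" in this curve is one informal guide for choosing k.
plot(ks, wss_by_k, type = "b", xlab = "k", ylab = "WSS")
```

The average silhouette width can be examined in a similar loop, for example with cluster::silhouette(cutree(tree, k), d) for k of at least 2, assuming the cluster package is installed.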

Value

The function returns a list (class: hclustcompro_cl).

`D1`	First dissimilarity matrix (square matrix)
`D2`	Second dissimilarity matrix (square matrix)
`D_alpha`	The matrix used in the clustering (CAH), resulting from the mixing of the two matrices (D1 and D2)
`alpha`	The value of the mixing parameter alpha
`tree`	An object of class hclust, describing the tree generated by the clustering process (see hclust)
`cluster`	Vector of cluster numbers for the selected partition
`cutree`	Plot of the cut dendrogram
`call`	The function call
`cont`	Original contingency data (if D1 is a contingency table)
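The components listed above can be read from the returned list with `$`. The sketch below assumes that SPARTAAS is installed and reuses the calls shown in the Examples section; the variable name `res` is hypothetical.

```r
# Run only if the package is available (assumption: SPARTAAS is installed).
if (requireNamespace("SPARTAAS", quietly = TRUE)) {
  library(SPARTAAS)
  data(datangkor)

  distance <- CAdist(datangkor$contingency, nPC = 11)
  constraint <- adjacency(datangkor$stratigraphy)

  res <- hclustcompro(D1 = distance, D2 = constraint, alpha = 0.7, k = 4)

  res$alpha            # mixing parameter stored in the result
  table(res$cluster)   # number of observations per cluster
  plot(res$tree)       # the hclust object can be plotted as usual
}
```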

Author(s)

The hclust function is based on Fortran code contributed to STATLIB by F. Murtagh.

A. COULON

L. BELLANGER

P. HUSI

Examples

library(SPARTAAS)
data(datangkor)

#network stratigraphic data (Network)
network <- datangkor$stratigraphy

#contingency table
cont <- datangkor$contingency

#obtain the dissimilarity matrices
distance <- CAdist(cont, nPC = 11)
constraint <- adjacency(network)

#You can also run hclustcompro with the dist matrix directly
hclustcompro(D1 = distance, D2 = constraint, alpha = 0.7, k = 4)

[Package SPARTAAS version 1.2.4 Index]