adjClust {adjclust}    R Documentation

Adjacency-constrained Clustering


Adjacency-constrained hierarchical agglomerative clustering


adjClust(mat, type = c("similarity", "dissimilarity"), h = ncol(mat) - 1)



mat: A similarity matrix or a dist object. Most sparse formats from sparseMatrix are allowed.


type: Type of matrix, either "similarity" or "dissimilarity". Defaults to "similarity".


h: Band width. The similarity between two items is assumed to be 0 when the items are more than h positions apart. Defaults to ncol(mat) - 1.
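The band assumption on h can be made concrete with a small base R sketch. The helper band_truncate below is hypothetical and is not part of adjclust; it only shows which entries of a similarity matrix are treated as 0 for a given h, and makes no claim about how adjClust handles them internally.

```r
# Hypothetical helper (not part of adjclust): zero out every entry
# lying more than h positions away from the diagonal.
band_truncate <- function(mat, h) {
  idx <- seq_len(ncol(mat))
  # outer() builds the matrix of index gaps |i - j|
  gap <- abs(outer(idx, idx, "-"))
  mat[gap > h] <- 0
  mat
}

sim <- matrix(
  c(1.0, 0.1, 0.2, 0.3,
    0.1, 1.0, 0.4, 0.5,
    0.2, 0.4, 1.0, 0.6,
    0.3, 0.5, 0.6, 1.0), nrow = 4)

# With h = 2, only the (1, 4) and (4, 1) entries fall outside the band
banded <- band_truncate(sim, h = 2)
```

With h = ncol(mat) - 1 (the default), no entry lies outside the band, so the matrix is left unchanged.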


Adjacency-constrained hierarchical agglomerative clustering (HAC) is a form of HAC in which each observation is associated with a position, and the clustering is constrained so that only adjacent clusters can be merged. These methods are useful in various application fields, including ecology (Quaternary data) and bioinformatics (e.g., Genome-Wide Association Studies (GWAS)).

This function is a fast implementation of the method that takes advantage of sparse similarity matrices (i.e., matrices whose entries are 0 outside a diagonal band of width h). The method is fully described in Dehman (2015) and is based on a kernel version of the algorithm. The different options for the implementation are described in the package vignette entitled "Notes on CHAC implementation in adjclust".
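To illustrate the adjacency constraint, here is a deliberately naive base R sketch. It is not the package's algorithm: the function name naive_adjacent_hac is hypothetical, it assumes Ward's linkage written in kernel form (computed directly from the similarity matrix), and it recomputes every candidate merge at each step instead of exploiting the band structure. Its output format is also ad hoc and is not guaranteed to match what adjClust returns.

```r
# Hypothetical naive implementation, for illustration only.
naive_adjacent_hac <- function(sim) {
  p <- ncol(sim)
  clusters <- as.list(seq_len(p))   # contiguous clusters, in position order
  merges <- vector("list", p - 1)
  heights <- numeric(p - 1)

  # Kernel form of Ward's linkage between index sets a and b
  ward <- function(a, b) {
    na <- length(a)
    nb <- length(b)
    (na * nb / (na + nb)) *
      (sum(sim[a, a]) / na^2 + sum(sim[b, b]) / nb^2 -
         2 * sum(sim[a, b]) / (na * nb))
  }

  for (step in seq_len(p - 1)) {
    # Candidate merges are restricted to neighbouring clusters
    costs <- vapply(seq_len(length(clusters) - 1),
                    function(k) ward(clusters[[k]], clusters[[k + 1]]),
                    numeric(1))
    k <- which.min(costs)
    merges[[step]] <- list(left = clusters[[k]], right = clusters[[k + 1]])
    heights[step] <- costs[k]
    # Fuse the two neighbours and drop the right-hand one from the list
    clusters[[k]] <- c(clusters[[k]], clusters[[k + 1]])
    clusters[[k + 1]] <- NULL
  }
  list(merges = merges, heights = heights)
}

sim <- matrix(
  c(1.0, 0.1, 0.2, 0.3,
    0.1, 1.0, 0.4, 0.5,
    0.2, 0.4, 1.0, 0.6,
    0.3, 0.5, 0.6, 1.0), nrow = 4)

res <- naive_adjacent_hac(sim)
res$merges[[1]]   # first merge joins items 3 and 4, the most similar neighbours
```

Because only neighbouring clusters are compared, every cluster produced along the way is a contiguous run of positions.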


An object of class chac which describes the tree produced by the clustering process. The object is a list with the same elements as an object of class hclust (merge, height, order, labels, call, method, dist.method), and an extra element mat: the data on which the clustering is performed, possibly after the pre-transformations described in the vignette entitled "Notes on CHAC implementation in adjclust".
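The elements listed above (merge, height, order, labels, call, method, dist.method) are the same ones found in a base R hclust object. As a rough guide to what they contain, the sketch below builds an ordinary, unconstrained hclust tree as a stand-in; the data and the "average" linkage are arbitrary choices for illustration, and the assumption is that the corresponding elements of an adjClust result can be read with the same $ syntax.

```r
# Base R only: an ordinary (unconstrained) hclust tree used as a stand-in
pts <- matrix(c(0, 1, 3, 6, 10), ncol = 1)
hc <- hclust(dist(pts), method = "average")

names(hc)    # element names of the hclust list
hc$merge     # (n - 1) x 2 matrix, one row per merge
hc$height    # merge heights, one per row of hc$merge
hc$order     # ordering of the items used when plotting the tree
```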


Dehman A. (2015). Spatial Clustering of Linkage Disequilibrium Blocks for Genome-Wide Association Studies, PhD thesis, Université Paris-Saclay.

Ambroise C., Dehman A., Neuvial P., Rigaill G., and Vialaneix N. (2019). Adjacency-constrained hierarchical clustering of a band similarity matrix with application to genomics, Algorithms for Molecular Biology 14(22).

See Also

snpClust to cluster SNPs based on linkage disequilibrium

hicClust to cluster Hi-C data


library(adjclust)

## symmetric 4 x 4 similarity matrix
sim <- matrix(
  c(1.0, 0.1, 0.2, 0.3,
    0.1, 1.0, 0.4, 0.5,
    0.2, 0.4, 1.0, 0.6,
    0.3, 0.5, 0.6, 1.0), nrow = 4)

## similarity, full width
fit1 <- adjClust(sim, "similarity")

## similarity, h < p-1
fit2 <- adjClust(sim, "similarity", h = 2)

## dissimilarity
dist <- as.dist(sqrt(2-(2*sim)))

## dissimilarity, full width
fit3 <- adjClust(dist, "dissimilarity")

## dissimilarity, h < p-1
fit4 <- adjClust(dist, "dissimilarity", h = 2)

[Package adjclust version 0.5.99]