HierarchicalClustering {sharp}    R Documentation

(Weighted) hierarchical clustering

Description

Runs hierarchical clustering using the implementation from hclust. If Lambda is provided, clustering is applied to the weighted distance matrix calculated using the cosa2 algorithm. Otherwise, distances are calculated using dist. This function does not use stability.
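The unweighted path described above can be sketched with base R alone. This is a minimal illustration, assuming the function combines dist, hclust and a dendrogram cut via cutree; the toy data and variable names are made up for the example, and the internals of sharp may differ.

```r
# Illustrative sketch only: toy data and variable names are assumptions
set.seed(1)
x <- rbind(
  matrix(rnorm(50, mean = 0), nrow = 10),
  matrix(rnorm(50, mean = 5), nrow = 10)
)

# Pairwise distances between observations (rows)
d <- stats::dist(x, method = "euclidean")

# Agglomerative clustering with complete linkage
tree <- stats::hclust(d, method = "complete")

# Cut the dendrogram to obtain 2 clusters
membership <- stats::cutree(tree, k = 2)
table(membership)
```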

Usage

HierarchicalClustering(
  xdata,
  nc = NULL,
  Lambda = NULL,
  distance = "euclidean",
  linkage = "complete",
  ...
)

Arguments

xdata

data matrix with observations as rows and variables as columns.

nc

matrix of parameters controlling the number of clusters in the underlying algorithm specified in implementation. If nc is not provided, it is set to seq(1, tau*nrow(xdata)).

Lambda

vector of penalty parameters (see argument lambda in cosa2). Unweighted distance matrices are used if Lambda=NULL.

distance

character string indicating the type of distance to use. If Lambda=NULL, possible values include "euclidean", "maximum", "canberra", "binary", and "minkowski" (see argument method in dist). Otherwise, possible values include "euclidean" (pwr=2) or "absolute" (pwr=1) (see argument pwr in cosa2).

linkage

character string indicating the type of linkage used in hierarchical clustering to define the stable clusters. Possible values include "complete", "single" and "average" (see argument method in hclust for a full list). Only used if implementation=HierarchicalClustering.
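To get a feel for how the linkage choice affects the partition, the three values listed above can be compared directly with hclust. This is a sketch on made-up data using base R only; it does not call HierarchicalClustering itself.

```r
# Illustrative sketch only: toy data, not taken from the package
set.seed(1)
x <- matrix(rnorm(40), nrow = 8)
d <- stats::dist(x, method = "euclidean")

# Cluster membership for 3 clusters under each linkage
linkages <- c("complete", "single", "average")
memberships <- lapply(linkages, function(linkage) {
  tree <- stats::hclust(d, method = linkage)
  stats::cutree(tree, k = 3)
})
names(memberships) <- linkages
print(memberships)
```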

...

additional parameters passed to hclust, dist, or cosa2. Parameters niter (default 1) and noit (default 100) correspond to the number of iterations used in cosa2 to calculate weights and may need to be modified. Argument pwr in cosa2 is ignored; please provide distance instead.

Value

A list with:

comembership

an array of binary and symmetric co-membership matrices.
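As an illustration of what a binary and symmetric co-membership matrix is, the sketch below builds one per number of clusters from cutree memberships. The array layout (one slice per value of k) and the treatment of the diagonal are assumptions of this example and may not match the object returned by the function.

```r
# Illustrative sketch only: toy data and array layout are assumptions
set.seed(1)
x <- matrix(rnorm(40), nrow = 8)
tree <- stats::hclust(stats::dist(x), method = "complete")

# One co-membership matrix per number of clusters
ks <- 1:3
comembership <- array(0, dim = c(nrow(x), nrow(x), length(ks)))
for (i in seq_along(ks)) {
  membership <- stats::cutree(tree, k = ks[i])
  # Entry (a, b) is 1 if observations a and b share a cluster, 0 otherwise
  comembership[, , i] <- outer(membership, membership, FUN = "==") * 1
}
dim(comembership)
```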

weights

a matrix of median weights by feature.

References

Kampert MM, Meulman JJ, Friedman JH (2017). “rCOSA: A Software Package for Clustering Objects on Subsets of Attributes.” Journal of Classification, 34(3), 514–547. doi:10.1007/s00357-017-9240-z.

Friedman JH, Meulman JJ (2004). “Clustering objects on subsets of attributes (with discussion).” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(4), 815–849. doi:10.1111/j.1467-9868.2004.02059.x.

See Also

Other clustering algorithms: DBSCANClustering(), GMMClustering(), KMeansClustering(), PAMClustering()

Examples


# Data simulation
set.seed(1)
simul <- SimulateClustering(n = c(10, 10), pk = 50)

# Hierarchical clustering
myhclust <- HierarchicalClustering(
  xdata = simul$data,
  nc = seq_len(20)
)

# Weighted Hierarchical clustering (using COSA)
if (requireNamespace("rCOSA", quietly = TRUE)) {
  myhclust <- HierarchicalClustering(
    xdata = simul$data,
    weighted = TRUE,
    nc = seq_len(20),
    Lambda = c(0.2, 0.5)
  )
}

[Package sharp version 1.4.6 Index]