cut_tree {bioregion}    R Documentation

Cut a hierarchical tree

Description

This function cuts a hierarchical tree, either at user-selected height(s) or into a chosen number of clusters. It accepts either the output of hclu_hierarclust() or an hclust object. When a number of clusters is requested, it can also search for and return the corresponding height(s) of cut.

Usage

cut_tree(
  tree,
  n_clust = NULL,
  cut_height = NULL,
  find_h = TRUE,
  h_max = 1,
  h_min = 0,
  dynamic_tree_cut = FALSE,
  dynamic_method = "tree",
  dynamic_minClusterSize = 5,
  dissimilarity = NULL,
  ...
)

Arguments

tree

a bioregion.hierar.tree or an hclust object

n_clust

an integer or a vector of integers indicating the number of clusters to be obtained from the hierarchical tree, or the output from partition_metrics(). Should not be used at the same time as cut_height

cut_height

a numeric vector indicating the height(s) at which the tree should be cut. Should not be used at the same time as n_clust

find_h

a boolean indicating if the height of cut should be found for the requested n_clust

h_max

a numeric indicating the maximum possible tree height for finding the height of cut when find_h = TRUE

h_min

a numeric indicating the minimum possible height in the tree for finding the height of cut when find_h = TRUE

dynamic_tree_cut

a boolean indicating if the dynamic tree cut method should be used, in which case n_clust and cut_height are ignored

dynamic_method

a character vector indicating the method to be used to dynamically cut the tree: either "tree" (clusters searched only in the tree) or "hybrid" (clusters searched on both tree and dissimilarity matrix)

dynamic_minClusterSize

an integer indicating the minimum cluster size to use in the dynamic tree cut method (see dynamicTreeCut::cutreeDynamic())

dissimilarity

the dissimilarity data.frame used to build the tree. Only needed when dynamic_method = "hybrid"

...

further arguments to be passed to dynamicTreeCut::cutreeDynamic() to customize the dynamic tree cut method.

Details

The function can cut the tree with two main methods. First, it can cut the entire tree at the same height (either specified by cut_height or automatically defined for the chosen n_clust). Second, it can use the dynamic tree cut method (Langfelder et al. 2008), in which case clusters are detected with an adaptive method based on the shape of branches in the tree (thus cuts happen at multiple heights depending on cluster positions in the tree).
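The first method, a single cut across the whole tree, can be illustrated with base R alone. This is a minimal sketch on a toy dissimilarity using stats::hclust() and stats::cutree(); it does not call bioregion, and the object names (toy, hc, by_k, by_h) are made up for the illustration.

```r
# Illustrative only: base R (stats) on toy data, not bioregion itself.
set.seed(1)
toy <- matrix(runif(40), nrow = 10,
              dimnames = list(paste0("Site", 1:10), NULL))
hc <- hclust(dist(toy), method = "average")

# Cut for a chosen number of clusters
by_k <- cutree(hc, k = 3)

# Cut at a chosen height: every branch is severed at the same level.
# The height used here lies midway between the two merges that bound
# the 3-cluster solution.
h <- mean(rev(hc$height)[2:3])
by_h <- cutree(hc, h = h)

# Both cuts describe the same partition
table(by_k, by_h)
```

Requesting a number of clusters and requesting a height are two ways of specifying the same kind of cut, which is why n_clust and cut_height should not be supplied together.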

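The dynamic tree cut method has two variants, selected with dynamic_method. The "tree" variant searches for clusters using only the tree. The "hybrid" variant searches for clusters using both the tree and the dissimilarity matrix, and therefore requires the dissimilarity argument.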
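To give a feel for the two variants, the sketch below calls dynamicTreeCut::cutreeDynamic() directly on a toy tree, outside bioregion. It assumes the dynamicTreeCut package is installed (the code is skipped otherwise); the toy data and object names are made up for the illustration, and how cut_tree() forwards its arguments internally is not shown here.

```r
# Sketch only: calls dynamicTreeCut directly on a toy tree, outside bioregion.
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  set.seed(1)
  toy <- matrix(runif(120), nrow = 30)
  d <- dist(toy)
  hc <- hclust(d, method = "average")

  # "tree" variant: only the dendrogram is used
  cl_tree <- dynamicTreeCut::cutreeDynamic(
    dendro = hc, method = "tree", minClusterSize = 5)

  # "hybrid" variant: the dendrogram plus the dissimilarity matrix
  cl_hybrid <- dynamicTreeCut::cutreeDynamic(
    dendro = hc, method = "hybrid", distM = as.matrix(d), minClusterSize = 5)

  # One label per object; 0 marks objects left unassigned
  print(table(cl_tree))
  print(table(cl_hybrid))
}
```

The two variants need not agree on the number or the membership of clusters, so it can be worth comparing both on the same tree.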

Value

If tree is an output from hclu_hierarclust(), then the same object is returned with its content updated (i.e., args and clusters). If tree is an hclust object, then a data.frame containing the clusters is returned.

Note

The argument find_h is ignored if dynamic_tree_cut = TRUE, because heights of cut cannot be estimated in this case.
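For a single-height cut, finding the height that yields a requested number of clusters can be pictured as a search between h_min and h_max. The sketch below uses bisection with base R only. It is an assumption made for illustration, not a description of the procedure bioregion actually uses, and find_cut_height() is a hypothetical helper.

```r
# Hypothetical helper, not part of bioregion: bisection on the cut height.
find_cut_height <- function(hc, k, h_min = 0, h_max = max(hc$height),
                            max_iter = 100) {
  for (i in seq_len(max_iter)) {
    h_mid <- (h_min + h_max) / 2
    n <- length(unique(cutree(hc, h = h_mid)))
    if (n == k) return(h_mid)
    # Too many clusters: cut higher. Too few: cut lower.
    if (n > k) h_min <- h_mid else h_max <- h_mid
  }
  NA_real_  # no height gives exactly k clusters (e.g. tied merge heights)
}

set.seed(1)
toy <- matrix(runif(40), nrow = 10)
hc <- hclust(dist(toy), method = "average")

h5 <- find_cut_height(hc, k = 5)
length(unique(cutree(hc, h = h5)))
```

A search of this kind only makes sense when one height applies to the whole tree, which is consistent with find_h being ignored for the dynamic tree cut.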

Author(s)

Pierre Denelle (pierre.denelle@gmail.com), Maxime Lenormand (maxime.lenormand@inrae.fr) and Boris Leroy (leroy.boris@gmail.com)

References

Langfelder P, Zhang B, Horvath S (2008). “Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R.” Bioinformatics, 24(5), 719–720.

See Also

hclu_hierarclust

Examples

comat <- matrix(sample(0:1000, size = 500, replace = TRUE, prob = 1/1:1001),
20, 25)
rownames(comat) <- paste0("Site", 1:20)
colnames(comat) <- paste0("Species", 1:25)

simil <- similarity(comat, metric = "all")
dissimilarity <- similarity_to_dissimilarity(simil)

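# Build a tree with a user-defined number of clusters
tree1 <- hclu_hierarclust(dissimilarity, n_clust = 5)

# Re-cut the tree at one height, for several numbers of clusters,
# or at several heights
tree2 <- cut_tree(tree1, cut_height = .05)
tree3 <- cut_tree(tree1, n_clust = c(3, 5, 10))
tree4 <- cut_tree(tree1, cut_height = c(.05, .1, .15, .2, .25))

# Same numbers of clusters, without searching for the heights of cut
tree5 <- cut_tree(tree1, n_clust = c(3, 5, 10), find_h = FALSE)

# cut_tree() also accepts a plain hclust object
hclust_tree <- tree2$algorithm$final.tree
clusters_2 <- cut_tree(hclust_tree, n_clust = 10)

# Dynamic tree cut (dissimilarity is only used when dynamic_method = "hybrid")
cluster_dynamic <- cut_tree(tree1, dynamic_tree_cut = TRUE,
                            dissimilarity = dissimilarity)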


[Package bioregion version 1.1.1 Index]