R: Finding communities using the Leiden algorithm

netclu_leiden {bioregion}

R Documentation

Description

This function finds communities in a weighted or unweighted undirected network using the Leiden algorithm of Traag, Waltman & van Eck (2019).

Usage

netclu_leiden(
  net,
  weight = TRUE,
  cut_weight = 0,
  index = names(net)[3],
  seed = NULL,
  objective_function = "CPM",
  resolution_parameter = 1,
  beta = 0.01,
  n_iterations = 2,
  vertex_weights = NULL,
  bipartite = FALSE,
  site_col = 1,
  species_col = 2,
  return_node_type = "both",
  algorithm_in_output = TRUE
)

Arguments

`net`	the output object from `similarity()` or `dissimilarity_to_similarity()`. If a `data.frame` is used, the first two columns represent pairs of sites (or any pair of nodes), and the next column(s) are the similarity indices.
`weight`	a `boolean` indicating whether the weights should be used when `net` has more than two columns.
`cut_weight`	a minimal weight value. If `weight` is TRUE, the links between sites with a weight strictly lower than this value will not be considered (0 by default).
`index`	name or number of the column to use as weight. By default, the third column name of `net` is used.
`seed`	seed for the random number generator (`NULL` by default, in which case no seed is set and results vary between runs).
`objective_function`	a string indicating the objective function to use, the Constant Potts Model ("CPM") or "modularity" ("CPM" by default).
`resolution_parameter`	the resolution parameter to use. Higher resolutions lead to more, smaller communities, while lower resolutions lead to fewer, larger communities.
`beta`	parameter affecting the randomness in the Leiden algorithm. This affects only the refinement step of the algorithm.
`n_iterations`	the number of times the Leiden algorithm is run. Each iteration may improve the partition further.
`vertex_weights`	the vertex weights used in the Leiden algorithm. If not provided, they are determined automatically from `objective_function`. See the documentation of cluster_leiden for how to interpret the vertex weights.
`bipartite`	a `boolean` indicating if the network is bipartite (see Note).
`site_col`	name or number for the column of site nodes (i.e. primary nodes).
`species_col`	name or number for the column of species nodes (i.e. feature nodes).
`return_node_type`	a `character` indicating what types of nodes ("sites", "species" or "both") should be returned in the output (`return_node_type = "both"` by default).
`algorithm_in_output`	a `boolean` indicating if the original output of cluster_leiden should be returned in the output (`TRUE` by default, see Value).
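
The effect of `cut_weight` and `index` described above can be pictured with a few lines of base R. This is only a minimal sketch of the documented behaviour (links with a weight strictly lower than `cut_weight` are dropped), not the package's internal code; the toy network and the names `toy_net` and `kept` are hypothetical.

```r
# Hypothetical toy network: two node columns followed by one similarity column
toy_net <- data.frame(
  Site1   = c("A", "A", "B"),
  Site2   = c("B", "C", "C"),
  Simpson = c(0.8, 0.05, 0.4)
)

cut_weight <- 0.1            # minimal weight for a link to be kept
index <- names(toy_net)[3]   # same default as the `index` argument

# Drop the links whose weight is strictly lower than cut_weight
kept <- toy_net[toy_net[[index]] >= cut_weight, ]
kept
```

With these values, the A-C link (weight 0.05) is the only one removed before clustering.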

Details

This function is based on the Leiden algorithm (Traag et al. 2019) as implemented in the igraph package (cluster_leiden).
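
To give an idea of the underlying call, the sketch below runs igraph's cluster_leiden directly on a small weighted edge list. It is an illustration under assumptions, not the wrapper's actual code: the edge list, the object names and the choice of the "modularity" objective are all hypothetical.

```r
library(igraph)

# Hypothetical toy edge list: two triangles joined by one weak link
edges <- data.frame(
  from   = c("A", "A", "B", "D", "D", "E", "C"),
  to     = c("B", "C", "C", "E", "F", "F", "D"),
  weight = c(1, 1, 1, 1, 1, 1, 0.1)
)

# Extra columns of the edge list become edge attributes (here, weight)
g <- graph_from_data_frame(edges, directed = FALSE)

set.seed(1)  # the algorithm is stochastic, so fix the seed for repeatability
cl <- cluster_leiden(
  g,
  objective_function = "modularity",
  weights = E(g)$weight,
  beta = 0.01,
  n_iterations = 2
)

membership(cl)  # community index assigned to each node
```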

Value

A list of class bioregion.clusters with five slots:

name: character containing the name of the algorithm
args: list of input arguments as provided by the user
inputs: list of characteristics of the clustering process
algorithm: list of all objects associated with the clustering procedure, such as original cluster objects (only if algorithm_in_output = TRUE)
clusters: data.frame containing the clustering results

In the algorithm slot, if algorithm_in_output = TRUE, users can find the output of cluster_leiden.
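
As a brief illustration of the structure described above, the slots can be read with `$`. The sketch reuses the simulated data from the Examples section; the fixed seeds are an assumption added for repeatability.

```r
library(bioregion)

# Simulated site-by-species matrix, as in the Examples section
set.seed(1)
comat <- matrix(sample(1000, 50), 5, 10)
rownames(comat) <- paste0("Site", 1:5)
colnames(comat) <- paste0("Species", 1:10)

net <- similarity(comat, metric = "Simpson")
com <- netclu_leiden(net, seed = 1)

names(com)          # slots of the returned list
com$name            # name of the algorithm
names(com$args)     # input arguments as provided by the user
head(com$clusters)  # data.frame containing the clustering results
```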

Note

Although this algorithm was not primarily designed to deal with bipartite networks, it is possible to treat a bipartite network as a unipartite network (bipartite = TRUE).

Do not forget to indicate which of the first two columns holds the site nodes (i.e. primary nodes) and which holds the species nodes (i.e. feature nodes) using the arguments site_col and species_col. The type of nodes returned in the output is chosen with the argument return_node_type: "both" keeps both types of nodes, "sites" keeps only the site nodes and "species" keeps only the species nodes.
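
The bipartite options above might be combined as in the following sketch, which clusters a site-species network and keeps only the site nodes in the output. It reuses the simulated data from the Examples section; the object name `sites_only` and the fixed seeds are hypothetical additions.

```r
library(bioregion)

# Simulated site-by-species matrix, as in the Examples section
set.seed(1)
comat <- matrix(sample(1000, 50), 5, 10)
rownames(comat) <- paste0("Site", 1:5)
colnames(comat) <- paste0("Species", 1:10)

# Site-species network: sites in column 1, species in column 2
net_bip <- mat_to_net(comat, weight = TRUE)

sites_only <- netclu_leiden(
  net_bip,
  bipartite = TRUE,
  site_col = 1,
  species_col = 2,
  return_node_type = "sites",
  seed = 1
)

# Only site nodes should remain in the clustering results
sites_only$clusters
```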

Author(s)

Maxime Lenormand (maxime.lenormand@inrae.fr), Pierre Denelle (pierre.denelle@gmail.com) and Boris Leroy (leroy.boris@gmail.com)

References

Traag VA, Waltman L, Van Eck NJ (2019). “From Louvain to Leiden: guaranteeing well-connected communities.” Scientific Reports, 9(1), 5233.

Examples

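# Simulated site-by-species abundance matrix (5 sites, 10 species)
comat <- matrix(sample(1000, 50), 5, 10)
rownames(comat) <- paste0("Site", 1:5)
colnames(comat) <- paste0("Species", 1:10)

# Unipartite case: cluster sites from their pairwise Simpson similarity
net <- similarity(comat, metric = "Simpson")
com <- netclu_leiden(net)

# Bipartite case: cluster the site-species network directly
net_bip <- mat_to_net(comat, weight = TRUE)
clust2 <- netclu_leiden(net_bip, bipartite = TRUE)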

[Package bioregion version 1.1.1 Index]