findCluster {NIMAA}    R Documentation

Find clusters in projected unipartite networks

Description

This function detects clusters in a projected unipartite network of the given bipartite network (supplied as an incidence matrix).

Usage

findCluster(
  inc_mat,
  part = 1,
  method = "all",
  normalization = TRUE,
  rm_weak_edges = TRUE,
  rm_method = "delete",
  threshold = "median",
  set_remaining_to_1 = TRUE,
  extra_feature = NULL,
  comparison = TRUE
)

Arguments

inc_mat

An incidence matrix.

part

An integer, 1 or 2, indicating which unipartite projection should be used. The default is 1.

method

A string array indicating the clustering methods. The default is "all", which means all clustering methods available in this function are used. Other options are combinations of "walktrap", "multi level", "infomap", "label propagation", "leading eigenvector", "spinglass", and "fast greedy".

normalization

A logical value indicating whether edge weights should be normalized before the computation proceeds. The default is TRUE.

rm_weak_edges

A logical value indicating whether weak edges should be removed before the computation proceeds. The default is TRUE.

rm_method

A string indicating the method for removing weak edges. If rm_weak_edges is FALSE, this argument is ignored. The default is "delete", which deletes weak edges from the network. The other option is "as_zero", which sets the weights of weak edges to 0.

threshold

A string indicating the method for selecting the weak edge threshold. If rm_weak_edges is FALSE, this argument is ignored. The default is "median". The other option is "keep_connected", which removes edges in ascending order of weight while preventing the network from becoming disconnected.

set_remaining_to_1

A logical value indicating whether the remaining edges' weight should be set to 1. The default is TRUE.

extra_feature

A data frame object that gives the group membership of each node based on prior knowledge. The default is NULL.

comparison

A logical value indicating whether the clustering methods should be compared to each other using internal measures of clustering, including modularity, average silhouette width, and coverage. The default is TRUE.

Details

This function first performs optional preprocessing, such as normalization, on the input incidence matrix (bipartite network). The matrix is then projected onto one of its two parts, and the selected projected network can be preprocessed further, for example by removing edges with low weights (weak edges). The user can specify the removal method, the threshold selection method, and whether the remaining weights are binarized. On the processed network, this function applies clustering methods from igraph, such as "walktrap" and "infomap", to detect communities. If external features (prior knowledge) are provided, the function also compares the clustering results with those features in terms of similarity, as an external validation of the clustering. Otherwise, only internal validation criteria, such as modularity and coverage, are reported to compare the clustering results.
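
The projection and weak-edge steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy matrix is made up, the projection is taken as the product of the incidence matrix with its transpose, and an edge is assumed to be "weak" when its weight is strictly below the median edge weight.

```r
# Hypothetical toy incidence matrix: 4 row nodes (A-D), 5 column nodes (a-e).
inc_mat <- matrix(
  c(1, 1, 1, 0, 0,
    1, 1, 0, 0, 0,
    0, 1, 1, 1, 0,
    0, 0, 0, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(LETTERS[1:4], letters[1:5])
)

# Projection onto the row nodes (comparable to part = 1): entry [i, j]
# counts the column nodes shared by rows i and j.
proj <- inc_mat %*% t(inc_mat)
diag(proj) <- 0  # drop self-loops

# Edge weights of the undirected projected network (upper triangle only).
w <- proj[upper.tri(proj)]
w <- w[w > 0]

# Assumed "median" rule: edges strictly below the median weight are weak.
cutoff <- median(w)
pruned <- proj
pruned[pruned < cutoff] <- 0  # comparable in spirit to rm_method = "as_zero"

# Binarize the surviving edges, as set_remaining_to_1 = TRUE suggests.
binary <- (pruned > 0) * 1
```

In this toy case the edges A-B and A-C (weight 2) survive, while B-C and C-D (weight 1) fall below the median of 1.5 and are dropped, leaving D isolated.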

Value

A list containing the igraph object of the projected network, the clustering results of each method on the projected network, and a comparison between them. The applied clustering arguments and the network's distance matrix are also included in this list for potential use in later steps. For weighted projected networks, the distance matrix is obtained by inverting the edge weights. The comparison of the selected clustering methods is also displayed as bar plots.
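
As a rough illustration of what "inverting the edge weights" can mean, the base R sketch below takes the reciprocal of each positive weight in a made-up weighted adjacency matrix. The matrix `W` and the treatment of unconnected pairs (set to `Inf`) are assumptions for this example; the function's actual computation may differ, for instance by using path lengths over the inverted weights.

```r
# Hypothetical weighted adjacency matrix of a projected network.
W <- matrix(
  c(0, 2, 4,
    2, 0, 0,
    4, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(LETTERS[1:3], LETTERS[1:3])
)

# Reciprocal of each positive weight: heavier edges mean shorter distances.
# Pairs without an edge are given an infinite distance (an assumption).
D <- matrix(Inf, nrow(W), ncol(W), dimnames = dimnames(W))
D[W > 0] <- 1 / W[W > 0]
diag(D) <- 0
```

Here the strongest edge, A-C with weight 4, yields the smallest distance, 0.25.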

Examples

# generate an incidence matrix
data <- matrix(c(1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0), nrow = 3)
colnames(data) <- letters[1:5]
rownames(data) <- LETTERS[1:3]

# run findCluster() to do clustering
cls <- findCluster(
  data,
  part = 1,
  method = "all",
  normalization = FALSE,
  rm_weak_edges = TRUE,
  comparison = TRUE
)

[Package NIMAA version 0.2.1 Index]