R: Calculate the topological properties for a network module

networkProperties {NetRep}

R Documentation

Calculate the topological properties for a network module

Description

Calculates the network properties used to assess module preservation for one or more modules in a user specified dataset.

Usage

networkProperties(
  network,
  data,
  correlation,
  moduleAssignments = NULL,
  modules = NULL,
  backgroundLabel = "0",
  discovery = NULL,
  test = NULL,
  simplify = TRUE,
  verbose = TRUE
)

Arguments

`network`	a list of interaction networks, one for each dataset. Each entry of the list should be an `n * n` matrix where each element contains the edge weight between nodes `i` and `j` in the inferred network for that dataset.
`data`	a list of matrices, one for each dataset. Each entry of the list should be the data used to infer the interaction `network` for that dataset. The columns should correspond to variables in the data (nodes in the network) and rows to samples in that dataset.
`correlation`	a list of matrices, one for each dataset. Each entry of the list should be an `n * n` matrix where each element contains the correlation coefficient between nodes `i` and `j` in the `data` used to infer the interaction network for that dataset.
`moduleAssignments`	a list of vectors, one for each discovery dataset, containing the module assignments for each node in that dataset.
`modules`	a list of vectors, one for each `discovery` dataset, of modules to perform the analysis on. If unspecified, all modules in each `discovery` dataset will be analysed, with the exception of the module specified in the `backgroundLabel` argument.
`backgroundLabel`	a single label given to nodes that do not belong to any module in the `moduleAssignments` argument. Defaults to "0". Set to `NULL` if you do not want to skip the network background module.
`discovery`	a vector of names or indices denoting the discovery dataset(s) in the `data`, `correlation`, `network`, `moduleAssignments`, `modules`, and `test` lists.
`test`	a list of vectors, one for each `discovery` dataset, of names or indices denoting the test dataset(s) in the `data`, `correlation`, and `network` lists.
`simplify`	logical; if `TRUE`, simplify the structure of the output list if possible (see Return Value).
`verbose`	logical; should progress be reported? Default is `TRUE`.

Details

Input data structures:

The preservation of network modules in a second dataset is quantified by measuring the preservation of topological properties between the discovery and test datasets. These properties are calculated not only from the interaction networks inferred in each dataset, but also from the data used to infer those networks (e.g. gene expression data) as well as the correlation structure between variables/nodes. Thus, all functions in the NetRep package have the following arguments:

network: a list of interaction networks, one for each dataset.
data: a list of data matrices used to infer those networks, one for each dataset.
correlation: a list of matrices containing the pairwise correlation coefficients between variables/nodes in each dataset.
moduleAssignments: a list of vectors, one for each discovery dataset, containing the module assignments for each node in that dataset.
modules: a list of vectors, one for each discovery dataset, containing the names of the modules from that dataset to analyse.
discovery: a vector indicating the names or indices of the previous arguments' lists to use as the discovery dataset(s) for the analyses.
test: a list of vectors, one vector for each discovery dataset, containing the names or indices of the network, data, and correlation argument lists to use as the test dataset(s) for the analysis of each discovery dataset.

The formatting of these arguments is not strict: each function will attempt to make sense of the user input. For example, if there is only one discovery dataset, then input to the moduleAssignments and test arguments may be vectors, rather than lists. If the network properties are being calculated within the discovery or test datasets, then the discovery and test arguments do not need to be specified, and the input matrices for the network, data, and correlation arguments do not need to be wrapped in a list.
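
To make the expected shapes concrete, the following base R sketch builds toy inputs matching the descriptions above: a samples-by-nodes data matrix, an `n * n` correlation matrix, and an `n * n` matrix of edge weights. The object names (`toy_data`, `cohort1`, and so on) are hypothetical, and the use of a squared absolute correlation as the edge weight is an assumption made for illustration only; it is not stated here how the networks should be inferred.

```r
# Toy data: 20 samples (rows) by 6 variables/nodes (columns).
set.seed(1)
toy_data <- matrix(
  rnorm(20 * 6), nrow = 20, ncol = 6,
  dimnames = list(paste0("sample", 1:20), paste0("node", 1:6))
)

# n * n matrix of pairwise correlation coefficients between nodes.
toy_correlation <- cor(toy_data)

# n * n matrix of edge weights. Assumption: squared absolute correlation,
# chosen only so that the sketch has a weighted network to work with.
toy_network <- abs(toy_correlation)^2

# Hypothetical module labels, one per node; "0" marks the background.
toy_labels <- setNames(c("1", "1", "1", "2", "2", "0"), colnames(toy_data))

# Wrap each input type in a list, using the same dataset name throughout.
network_list <- list(cohort1 = toy_network)
data_list <- list(cohort1 = toy_data)
correlation_list <- list(cohort1 = toy_correlation)
labels_list <- list(cohort1 = toy_labels)
```

Naming the nodes consistently across the data columns, the matrix dimnames, and the label vector keeps the inputs easy to cross-check before passing them on.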

Analysing large datasets:

Matrices in the network, data, and correlation lists can be supplied as disk.matrix objects. This class allows matrix data to be kept on disk and loaded as required by NetRep. This dramatically decreases memory usage: the matrices for only one dataset will be kept in RAM at any point in time.
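
The memory pattern described above can be illustrated with base R alone. The sketch below is not the disk.matrix implementation and does not use its API; it only shows the general idea of storing one matrix per dataset on disk and loading a single matrix at a time. The file names and the temporary directory are hypothetical.

```r
# Write one toy network matrix per dataset to a temporary directory.
tmp_dir <- tempfile("matrices_")
dir.create(tmp_dir)

set.seed(7)
paths <- character(0)
for (name in c("discovery", "test")) {
  m <- matrix(runif(16), nrow = 4, ncol = 4)
  paths[name] <- file.path(tmp_dir, paste0(name, "_network.rds"))
  saveRDS(m, paths[name])
}
rm(m)  # nothing is held in memory between datasets

# Process the datasets one at a time: each matrix is read inside the
# function and becomes eligible for garbage collection once it returns.
avg_by_dataset <- vapply(paths, function(p) {
  m <- readRDS(p)
  mean(m[upper.tri(m)])
}, numeric(1))
```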

Value

A nested list structure. At the top level, the list has one element per 'discovery' dataset. Each of these elements is a list that has one element per 'test' dataset analysed for that 'discovery' dataset. Each of these elements is a list that has one element per module specified in 'modules'. Each of these is a list containing the following objects:

'degree': The weighted within-module degree: the sum of edge weights for each node in the module.
'avgWeight': The average edge weight within the module.
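
As a rough illustration of these two properties, the base R sketch below computes them by hand for a toy module. It assumes that self-edges (the matrix diagonal) are excluded and that the network is symmetric; these conventions, and all object names, are assumptions of the sketch rather than a description of NetRep's internals.

```r
# Toy symmetric network of 4 nodes with known edge weights.
net <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
net["a", "b"] <- net["b", "a"] <- 0.8
net["a", "c"] <- net["c", "a"] <- 0.6
net["b", "c"] <- net["c", "b"] <- 0.4
net["c", "d"] <- net["d", "c"] <- 0.1
diag(net) <- 1

# Nodes a, b, and c form module "1"; node d is background ("0").
labels <- c(a = "1", b = "1", c = "1", d = "0")
module_nodes <- names(labels)[labels == "1"]

# Restrict the network to edges between nodes of the module.
sub <- net[module_nodes, module_nodes]
diag(sub) <- 0  # assumption: self-edges do not count

# Weighted within-module degree: sum of edge weights for each node.
degree <- rowSums(sub)

# Average edge weight: mean over each distinct pair of module nodes.
avg_weight <- mean(sub[upper.tri(sub)])
```

Here node `a` has degree 0.8 + 0.6 = 1.4, and the average edge weight is (0.8 + 0.6 + 0.4) / 3 = 0.6. The edge from `c` to the background node `d` is ignored because it leaves the module.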

If the 'data' used to infer the 'test' network is provided then the following are also returned:

'summary': A vector summarising the module across each sample. This is calculated as the first eigenvector of the module from a principal component analysis.
'contribution': The node contribution: the similarity between each node and the module summary profile ('summary').
'coherence': The proportion of module variance explained by the 'summary' vector.
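
One way to picture the three data-derived properties is through a singular value decomposition of the standardised module data. The base R sketch below is an illustration under stated assumptions, not NetRep's implementation: it assumes each node is centred and scaled, takes the first left singular vector as the summary, measures contribution as the Pearson correlation with that vector, and flips the (arbitrary) sign of the summary so it agrees with the average node.

```r
# Toy module: 3 nodes sharing a common signal across 30 samples.
set.seed(42)
n_samples <- 30
signal <- rnorm(n_samples)
module_data <- cbind(
  node1 = signal + rnorm(n_samples, sd = 0.3),
  node2 = signal + rnorm(n_samples, sd = 0.3),
  node3 = signal + rnorm(n_samples, sd = 0.3)
)

# Centre and scale each node, then decompose.
scaled <- scale(module_data)
decomp <- svd(scaled)

# Summary: one value per sample, from the first left singular vector.
summary_profile <- decomp$u[, 1]
if (cor(summary_profile, rowMeans(scaled)) < 0) {
  summary_profile <- -summary_profile
}

# Contribution: correlation of each node with the summary profile.
contribution <- apply(scaled, 2, function(x) cor(x, summary_profile))

# Coherence: proportion of module variance explained by the summary.
coherence <- decomp$d[1]^2 / sum(decomp$d^2)
```

Under these assumptions the coherence equals the mean of the squared contributions, which gives a quick consistency check between the two quantities.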

When simplify = TRUE then the simplest possible structure will be returned. E.g. if the network properties are requested for only one module in only one dataset, then the returned list will have only the above elements.

When simplify = FALSE then a nested list of datasets will always be returned, i.e. each element at the top level and second level corresponds to a dataset, and each element at the third level corresponds to a module discovered in the dataset specified at the top level, if module labels are provided in the corresponding moduleAssignments list element. E.g. results[["Dataset1"]][["Dataset2"]][["module1"]] will contain the properties of "module1" as calculated in "Dataset2", where "module1" was identified in "Dataset1". Modules and datasets for which calculation of the network properties has not been requested will contain NULL.
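
The nesting described for simplify = FALSE can be mocked up in base R to show how results are accessed. The list below is hand-built with placeholder values; it is not output from networkProperties, and the dataset and module names are the hypothetical ones used in the text above.

```r
# Placeholder properties for one module (values are made up).
module_props <- list(degree = c(a = 1.4, b = 1.2), avgWeight = 0.8)

# Layout: discovery dataset -> test dataset -> module.
# Entries that were not requested are kept in place as NULL.
results <- list(
  Dataset1 = list(
    Dataset1 = NULL,
    Dataset2 = list(module1 = module_props, module2 = NULL)
  ),
  Dataset2 = NULL
)

# Properties of "module1", identified in Dataset1, calculated in Dataset2.
m1 <- results[["Dataset1"]][["Dataset2"]][["module1"]]
m1$avgWeight

# Guard against entries that were not requested before using them.
requested <- !is.null(results[["Dataset1"]][["Dataset2"]][["module2"]])
```

Checking for NULL before indexing further avoids silently working with missing results when only some modules or datasets were analysed.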

Examples

# load in example data, correlation, and network matrices for a discovery and test dataset:
data("NetRep")

# Set up input lists for each input matrix type across datasets. The list
# elements can have any names, so long as they are consistent between the
# inputs.
network_list <- list(discovery=discovery_network, test=test_network)
data_list <- list(discovery=discovery_data, test=test_data)
correlation_list <- list(discovery=discovery_correlation, test=test_correlation)
labels_list <- list(discovery=module_labels)

# Calculate the topological properties of all network modules in the discovery dataset
props <- networkProperties(
  network=network_list, data=data_list, correlation=correlation_list, 
  moduleAssignments=labels_list
)
  
# Calculate the topological properties in the test dataset for the same modules
test_props <- networkProperties(
  network=network_list, data=data_list, correlation=correlation_list,
  moduleAssignments=labels_list, discovery="discovery", test="test"
)

[Package NetRep version 1.2.7 Index]