graph_two_GOspecies {GOCompare} | R Documentation |
Undirected network representation for the results of functional enrichment analysis to compare two species and a series of categories
Description
graph_two_GOspecies creates undirected graphs from the results of a functional enrichment analysis that compares two species.
graph_two_GOspecies is analogous to the graphGOspecies function and has the same options ("Categories" and "GO"). However, the edge and node weights are calculated slightly differently. Because two species are compared, three graphs are possible: \(G_1\), \(G_2\), and \(G_3\). \(G_1\) and \(G_2\) represent each of the species analyzed, and \(G_3\) is a subgraph of \(G_1\) and \(G_2\) that contains the GO terms or categories co-occurring between both species.
Categories option: (Weight): The nodes \((V)\) represent groups of gene lists (categories), and the edges \((E)\) represent GO terms co-occurring between pairs of categories. The node weight measures how well a GO term is conserved between the two species and across a series of categories, but it is biased toward categories.
\[\widehat{K}_w(u)=\sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v)\] (5)
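Equation 5 can be illustrated with a minimal sketch in base R. It assumes that \(w(u,v)\) is an edge weight that has already been computed for each species graph and that \(V_1\) and \(V_2\) are the node sets of \(G_1\) and \(G_2\); the edge lists, the weights, and the helper names incident_weight and node_weight are made up for illustration and are not part of GOCompare.

```r
# Toy edge lists with invented weights, one per species graph (G1 and G2).
edges_sp1 <- data.frame(
  SOURCE = c("A", "A", "B"),
  TARGET = c("B", "C", "C"),
  WEIGHT = c(0.50, 0.25, 0.75),
  stringsAsFactors = FALSE
)
edges_sp2 <- data.frame(
  SOURCE = c("A", "B"),
  TARGET = c("B", "C"),
  WEIGHT = c(1.00, 0.50),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum of the weights of all edges touching node u
# in one undirected graph.
incident_weight <- function(edges, u) {
  sum(edges$WEIGHT[edges$SOURCE == u | edges$TARGET == u])
}

# Equation 5: incident weight in G1 plus incident weight in G2.
node_weight <- function(u) {
  incident_weight(edges_sp1, u) + incident_weight(edges_sp2, u)
}

k_w <- vapply(c("A", "B", "C"), node_weight, numeric(1))
print(k_w)
```

With these toy inputs, category B gets the largest weight because it has edges in both species graphs.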
(shared weight): The nodes \((V)\) represent groups of gene lists (categories), and the edges \((E)\) represent GO terms co-occurring between pairs of categories that are shared between the two species. The node weight \(K_s\) is computed from the shared edge weight \(s\), where \(N_1\) and \(N_2\) are the sets of GO terms associated with the edge \(e = (u,v)\) for species 1 and 2, respectively. The node shared weight \(K_s(u)\) is therefore the sum of \(s\) over the edges incident to \(u\).
\[s(e) = \frac{\mid N_1 \cap N_2 \mid}{\mid N_1 \cup N_2 \mid}\] (6)
\[K_s(u)=\sum_{v \in (V_1 \cup V_2)} s(u,v)\] (7)
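The shared weight is a Jaccard-style ratio of GO term sets, which the following base R sketch tries to make concrete. The GO identifiers, the "A|B" edge naming, and the helper names shared_weight and node_shared_weight are invented for illustration; returning 0 for an edge with no GO terms in either species is an assumption, not documented package behavior.

```r
# Toy GO term sets (invented) attached to each edge, per species.
go_sp1 <- list(
  "A|B" = c("GO:0000001", "GO:0000002", "GO:0000003"),
  "A|C" = c("GO:0000004"),
  "B|C" = c("GO:0000002", "GO:0000005")
)
go_sp2 <- list(
  "A|B" = c("GO:0000002", "GO:0000003", "GO:0000006"),
  "A|C" = character(0),
  "B|C" = c("GO:0000002", "GO:0000005")
)

# Equation 6: size of the intersection over size of the union.
shared_weight <- function(n1, n2) {
  all_terms <- union(n1, n2)
  if (length(all_terms) == 0) return(0)  # assumption: empty edge scores 0
  length(intersect(n1, n2)) / length(all_terms)
}
s <- mapply(shared_weight, go_sp1, go_sp2)

# Equation 7: sum the shared edge weights over the edges incident to u.
endpoints <- strsplit(names(s), "|", fixed = TRUE)
node_shared_weight <- function(u) {
  touches_u <- vapply(endpoints, function(e) u %in% e, logical(1))
  sum(s[touches_u])
}
k_s <- vapply(c("A", "B", "C"), node_shared_weight, numeric(1))

print(s)
print(k_s)
```

An edge whose GO terms are identical in both species scores 1, and an edge whose GO terms occur in only one species scores 0.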
(combined weight): This node weight \(K_c(u)\) combines the weight and the shared weight. Its purpose is to find the categories whose GO terms co-occur most frequently, so that functional similarities between the two species can be observed with a balance between GO terms co-occurring among gene lists (categories) and GO terms co-occurring between the two species. This node weight ranges from -1 (categories with GO terms found in only one species and in few categories) to 1 (categories with GO terms shared widely between species and among other categories). The combined node weight \(K_c\) is defined as the sum of the min-max normalized weights \(\widehat{K}_w\) and \(K_s\) minus 1.
\[\mathrm{minmax}(y)=\frac{y-\min(y)}{\max(y)-\min(y)}\] (8)
\[K_c(u)= \mathrm{minmax}(\widehat{K}_w(u)) + \mathrm{minmax}(K_s(u)) - 1\] (9)
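As a rough illustration of equations 8 and 9, the base R sketch below normalizes two made-up vectors of node weights and combines them. The input values are invented, and the guard for the case where all values are equal is an assumption added to avoid dividing by zero; the package may handle that case differently.

```r
# Equation 8: min-max normalization to the interval [0, 1].
minmax <- function(y) {
  rng <- range(y)
  if (rng[1] == rng[2]) return(rep(0, length(y)))  # assumption: constant input maps to 0
  (y - rng[1]) / (rng[2] - rng[1])
}

# Invented node weights and shared node weights for three categories.
k_w <- c(A = 1.75, B = 2.75, C = 1.50)
k_s <- c(A = 0.50, B = 1.50, C = 1.00)

# Equation 9: sum of the two normalized weights, shifted to [-1, 1].
k_c <- minmax(k_w) + minmax(k_s) - 1
print(k_c)
```

Because each normalized term lies in [0, 1], the combined weight always lies in [-1, 1]; here category B reaches 1 because it has the largest value of both weights.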
GO option: As with the Categories option, three graphs are possible: \(G_1\), \(G_2\), and \(G_3\). \(G_1\) and \(G_2\) represent each of the species analyzed, and \(G_3\) is a subgraph of \(G_1\) and \(G_2\) that contains the GO terms or categories co-occurring between both species. In this case, nodes are GO terms and edges are the categories in which the GO terms co-occur. The node weight is similar to the GO weight calculated by the graphGOspecies function and is computed with equation 5.
\[\widehat{K}_w(u)=\sum_{v \in V_1} w(u,v) + \sum_{v \in V_2} w(u,v)\] (5)
Usage
graph_two_GOspecies(
  x,
  species1,
  species2,
  GOterm_field,
  saveGraph = FALSE,
  option = "Categories",
  numCores = 2,
  outdir = NULL,
  filename = NULL
)
Arguments
x | A list obtained as the output of the compareGOspecies function |
species1 | A string with the species name for species 1 (e.g., "H. sapiens") |
species2 | A string with the species name for species 2 (e.g., "A. thaliana") |
GOterm_field | A string with the name of the column that holds the GO terms (e.g., "Functional_Category") |
saveGraph | Logical; if TRUE, the graph is saved to a file (default value saveGraph = FALSE) |
option | (values: "Categories" or "GO"). Selects the graph to build. With "Categories", nodes are categories (gene lists) and edges are the GO terms co-occurring between pairs of categories. With "GO", nodes are GO terms and edges are the categories in which the GO terms co-occur. In both cases, species membership is recorded as an edge attribute (default value = "Categories") |
numCores | Numeric; number of cores to use for the process (default value numCores = 2). For the example below, only one core will be used |
outdir | The folder where the graph file will be saved (e.g., "D:"). This parameter only works when saveGraph = TRUE |
filename | The name of the graph file to be saved in the folder given by outdir. This parameter only works when saveGraph = TRUE |
Value
This function returns a list with two slots: edges and nodes.
(Categories):
Edges list columns:
Column | Description |
SOURCE and TARGET | The source and target categories (Nodes in the edge) |
GO_N | The number of GO terms between the categories |
WEIGHT | Edge weight |
GO | GO terms available for both nodes |
SP1 | Number of GO terms for species 1 |
SP2 | Number of GO terms for species 2 |
SHARED | Number of GO terms shared or co-occurring between the categories |
SHARED_WEIGHT | Shared weight for the edge |
Node list columns:
Column | Description |
CAT | Category name |
CAT_WEIGHT | Node weight |
SHARED_WEIGHT | Shared weight for the node |
COMBINED_WEIGHT | Combined weight for the node |
(GO):
Edges list columns:
Column | Description |
SOURCE and TARGET | The source and target GO terms (Nodes in the edge) |
FEATURE | The number of Categories where both GO Terms were found |
SP | Species where the GO terms were found (Species 1, Species 2 or Shared) |
WEIGHT | Edge weight |
Node list columns:
Column | Description |
GO | GO term node name |
GO_WEIGHT | Node weight |
Examples
library(GOCompare)

# Name of the column that holds the GO terms
GOterm_field <- "Functional_Category"
# Load the example data set
data(comparison_ex_compress_CH)
# Defining the species names
species1 <- "H. sapiens"
species2 <- "A. thaliana"
# Build the Categories graph on one core without saving it to a file
x_graph <- graph_two_GOspecies(x = comparison_ex_compress_CH,
                               species1 = species1,
                               species2 = species2,
                               GOterm_field = GOterm_field,
                               numCores = 1,
                               saveGraph = FALSE,
                               option = "Categories",
                               outdir = NULL,
                               filename = NULL)
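Once a graph has been built, the returned list can be explored with ordinary data frame operations. The sketch below does not call the package: x_graph_mock is a hand-made stand-in whose values are invented, and it assumes that the two slots are named edges and nodes and that the node table of the Categories option carries the columns listed in the Value section.

```r
# Hand-made stand-in for the Categories output; all values are invented.
x_graph_mock <- list(
  edges = data.frame(
    SOURCE = c("A", "B"),
    TARGET = c("B", "C"),
    SHARED_WEIGHT = c(0.5, 1.0),
    stringsAsFactors = FALSE
  ),
  nodes = data.frame(
    CAT = c("A", "B", "C"),
    CAT_WEIGHT = c(1.75, 2.75, 1.50),
    SHARED_WEIGHT = c(0.5, 1.5, 1.0),
    COMBINED_WEIGHT = c(-0.8, 1.0, -0.5),
    stringsAsFactors = FALSE
  )
)

# Rank the categories from most to least conserved by combined weight.
nodes <- x_graph_mock$nodes
ranked <- nodes[order(nodes$COMBINED_WEIGHT, decreasing = TRUE), ]
top_cat <- ranked$CAT[1]

print(ranked)
print(top_cat)
```

Sorting by COMBINED_WEIGHT places first the categories whose GO terms are shared most widely between the two species and among other categories.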