networkConcepts {WGCNA} | R Documentation |
Calculations of network concepts
Description
This function calculates various network concepts (topological properties, network indices) of a network calculated from expression data. See Details for a full description.
Usage
networkConcepts(datExpr, power = 1, trait = NULL, networkType = "unsigned")
Arguments
datExpr |
a data frame containing the expression data, with rows corresponding to samples and columns to genes (nodes). |
power |
soft thresholding power. |
trait |
optional specification of a sample trait. A vector of length equal to the number of samples
in |
networkType |
network type. Recognized values are (unique abbreviations of) |
Details
This function computes various network concepts (also known as network statistics, topological
properties, or network indices) for a weighted correlation network. The network is constructed
between the columns (interpreted as nodes) of the input datExpr.
If the option networkType="unsigned" is used, then the adjacency between nodes i and j is defined as
A[i,j]=abs(cor(datExpr[,i],datExpr[,j]))^power.
In the following, we use the terms gene and node interchangeably since these methods were originally
developed for gene networks. The function computes the following
4 types of network concepts (introduced in Horvath and Dong 2008):
Type I: fundamental network concepts are defined as a function of the off-diagonal elements of an
adjacency matrix A and/or a node significance measure GS. These network concepts can be defined for any
network (not just correlation networks).
The adjacency matrix of an unsigned weighted correlation network is given by
A=abs(cor(datExpr,use="p"))^power
and the trait based gene significance measure is given by
GS= abs(cor(datExpr,trait, use="p"))^power
where datExpr, trait, and power are input parameters.
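The definitions of A and GS above, together with a few fundamental concepts derived from the off-diagonal elements of A, can be sketched in base R as follows. The simulated data and all variable names other than datExpr, trait and power are assumptions made for illustration. The formulas follow the usual definitions of these concepts and are not copied from the package source, so the WGCNA implementation may differ in details.

```r
# Hypothetical simulated data: 30 samples (rows), 6 genes (columns).
set.seed(1)
nSamples <- 30
nGenes <- 6
shared <- rnorm(nSamples)  # common signal so that the genes are correlated
datExpr <- as.data.frame(sapply(seq_len(nGenes), function(i) shared + rnorm(nSamples)))
trait <- shared + rnorm(nSamples)
power <- 2

# Unsigned weighted adjacency and trait-based gene significance, as defined above.
A <- abs(cor(datExpr, use = "p"))^power
GS <- as.vector(abs(cor(datExpr, trait, use = "p"))^power)

# Fundamental concepts depend only on the off-diagonal elements of A.
A0 <- A
diag(A0) <- 0
connectivity <- rowSums(A0)                              # connectivity (degree) of each node
density <- sum(connectivity) / (nGenes * (nGenes - 1))   # mean off-diagonal adjacency
centralization <- nGenes * (max(connectivity) - mean(connectivity)) /
  ((nGenes - 1) * (nGenes - 2))
# Coefficient of variation of the connectivity (population variance).
heterogeneity <- sqrt(nGenes * sum(connectivity^2) / sum(connectivity)^2 - 1)
# Weighted cluster coefficient of each node.
clusterCoef <- diag(A0 %*% A0 %*% A0) / (connectivity^2 - rowSums(A0^2))
# Maximum adjacency ratio of each node.
MAR <- rowSums(A0^2) / connectivity
```

Setting the diagonal to zero before computing the row sums is what restricts these concepts to the off-diagonal elements of the adjacency matrix.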
Type II: conformity-based network concepts are functions of the off-diagonal elements of the conformity
based adjacency matrix A.CF=CF*t(CF) and/or the node significance measure. These network concepts are
defined for any network for which a conformity vector can be defined. Details: For any adjacency matrix A,
the conformity vector CF is calculated by requiring that A[i,j] is approximately equal to CF[i]*CF[j].
Using the conformity one can define the matrix A.CF=CF*t(CF), which is the outer product of the conformity
vector with itself. In general, A.CF is not an adjacency matrix since its diagonal elements are different
from 1. If the off-diagonal elements of A.CF are similar to those of A according to the Frobenius matrix
norm, then A is approximately factorizable. To measure the factorizability of a network, one can
calculate the Factorizability, which is a number between 0 and 1 (Dong and Horvath 2007). The conformity
is defined using a monotonic, iterative algorithm that maximizes the factorizability measure.
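The idea of fitting a conformity vector and measuring factorizability can be illustrated with the following sketch. It uses a simple coordinate-wise least-squares update on the off-diagonal elements, which is an assumption chosen for illustration and not necessarily the algorithm used by the package. The helper names fitConformity and factorizability are hypothetical. In R code, the outer product written above as CF*t(CF) corresponds to outer(CF, CF).

```r
# Fit CF so that A[i,j] is approximately CF[i] * CF[j] for i != j.
# Illustrative coordinate updates; not necessarily the WGCNA algorithm.
fitConformity <- function(A, maxIter = 500, tol = 1e-12) {
  A0 <- A
  diag(A0) <- 0
  k <- rowSums(A0)
  CF <- k / sqrt(sum(k))  # starting value based on the connectivity
  for (iter in seq_len(maxIter)) {
    old <- CF
    for (i in seq_along(CF)) {
      # Least-squares optimal CF[i] given the other entries (diagonal excluded).
      CF[i] <- sum(A0[i, -i] * CF[-i]) / sum(CF[-i]^2)
    }
    if (max(abs(CF - old)) < tol) break
  }
  CF
}

# One minus the relative squared Frobenius error on the off-diagonal elements.
factorizability <- function(A, CF) {
  A.CF <- outer(CF, CF)     # conformity-based adjacency matrix
  off <- row(A) != col(A)   # mask selecting the off-diagonal elements
  1 - sum((A[off] - A.CF[off])^2) / sum(A[off]^2)
}

# Toy network that is exactly factorizable off the diagonal.
cf <- c(0.9, 0.8, 0.7, 0.6, 0.5)
A <- outer(cf, cf)
diag(A) <- 1
CF <- fitConformity(A)
Fz <- factorizability(A, CF)
```

Each coordinate update minimizes the off-diagonal squared error with respect to one entry of CF, so the error never increases from one step to the next. For the toy network the fitted CF should recover cf and the factorizability should be close to 1.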
Type III: approximate conformity based network concepts are functions of all elements of the conformity
based adjacency matrix A.CF (including the diagonal) and/or the node significance measure GS. These
network concepts are very useful for deriving relationships between network concepts in networks that are
approximately factorizable.
Type IV: eigengene-based (also known as eigennode-based) network concepts are functions of the
eigengene-based adjacency matrix A.E=ConformityE*t(ConformityE) (diagonal included) and/or the
corresponding eigengene-based gene significance measure GSE. These network concepts can only be defined
for correlation networks. Details: The columns (nodes) of datExpr can be summarized with the first principal
component, which is referred to as the Eigengene in coexpression network analysis. In general correlation
networks, it is called the eigennode. The eigengene-based conformity ConformityE[i] is defined as
abs(cor(datExpr[,i], Eigengene))^power, where the power corresponds to the power used for defining the
weighted adjacency matrix A. The eigengene-based conformity can also be used to define an eigengene-based
adjacency matrix A.E=ConformityE*t(ConformityE).
The eigengene-based factorizability EF(datExpr) is a number between 0 and 1 that measures how well
A.E approximates A when the power parameter equals 1. EF(datExpr) is defined with respect to the
singular values of datExpr. For a trait-based node significance measure GS=abs(cor(datExpr,trait))^power,
one can also define an eigengene-based node significance measure GSE[i]=ConformityE[i]*EigengeneSignificance,
where the eigengene significance abs(cor(Eigengene,trait))^power is defined as the power of the absolute value
of the correlation between the eigengene and the trait.
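The eigengene-based quantities defined above can be sketched in base R as follows. The simulated data are hypothetical, and taking the eigengene as the first left singular vector of the standardized data matrix is an assumption about how the first principal component is obtained. Its sign is arbitrary, which does not matter here because only absolute correlations are used.

```r
# Hypothetical simulated data: 40 samples (rows), 5 genes (columns).
set.seed(2)
nSamples <- 40
nGenes <- 5
shared <- rnorm(nSamples)
datExpr <- sapply(seq_len(nGenes), function(i) shared + rnorm(nSamples, sd = 0.7))
trait <- shared + rnorm(nSamples)
power <- 1

# Eigengene: first principal component of the standardized columns of datExpr.
sv <- svd(scale(datExpr))
Eigengene <- sv$u[, 1]                     # one value per sample
VarExplained <- sv$d[1]^2 / sum(sv$d^2)    # proportion of variance explained

# Eigengene-based conformity and the corresponding adjacency matrix.
ConformityE <- as.vector(abs(cor(datExpr, Eigengene))^power)
A.E <- outer(ConformityE, ConformityE)     # diagonal included

# Eigengene significance and eigengene-based gene significance.
EigengeneSignificance <- abs(cor(Eigengene, trait))^power
GSE <- ConformityE * EigengeneSignificance
```

With power equal to 1, comparing the off-diagonal elements of A.E with those of abs(cor(datExpr)) gives an informal impression of how well the eigengene-based matrix approximates the adjacency matrix.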
Eigengene-based network concepts are very useful for providing a geometric interpretation of network
concepts and for deriving relationships between network concepts. For example, the hub gene significance
measure and its eigengene-based analog have been used to characterize networks where highly connected hub
genes are important with regard to a trait based gene significance measure (Horvath and Dong 2008).
Value
A list with the following components:
Summary |
a data frame whose rows report network concepts that depend only on the adjacency matrix: Density (mean adjacency), Centralization, Heterogeneity (coefficient of variation of the connectivity), Mean ClusterCoef, and Mean Connectivity. The columns of the data frame report the 4 types of network concepts described in Details: fundamental concepts, eigengene-based concepts, conformity-based concepts, and approximate conformity-based concepts. |
Size |
reports the network size, i.e. the number of nodes, which equals the number of columns of
the input data frame |
Factorizability |
a number between 0 and 1. The closer it is to 1, the better the off-diagonal
elements of the conformity based network |
Eigengene |
the first principal component of the standardized columns of |
VarExplained |
the proportion of variance explained by the first principal component (the
|
Conformity |
numerical vector giving the conformity.
The number of components of the conformity vector equals the number of columns in
|
ClusterCoef |
a numerical vector that reports the cluster coefficient for each node. This fundamental network concept measures the cliquishness of each node. |
Connectivity |
a numerical vector that reports the connectivity (also known as degree) of each
node. This fundamental network concept is also known as whole network connectivity. One can also define
the scaled connectivity |
MAR |
a numerical vector that reports the maximum adjacency ratio for each node. |
ConformityE |
a numerical vector that reports the eigengene based (aka eigennode based)
conformity for the correlation network. The number of components equals the number of columns of
|
GS |
a numerical vector that encodes the node (gene) significance. The i-th component equals the
node significance of the i-th column of |
GSE |
a numerical vector that reports the eigengene based gene significance measure. Its i-th
component is given by |
Significance |
a data frame whose rows report network concepts that also depend on the trait based
node significance measure. The rows correspond to network concepts and the columns correspond to the type
of network concept (fundamental versus eigengene based). The first row of the data frame reports the
network significance. The fundamental version of this network concept is the average gene
significance=mean(GS). The eigengene based analog of this concept is defined as mean(GSE). The second row
reports the hub gene significance, which is defined as the slope of the regression model without an
intercept term that regresses the gene significance on the scaled network connectivity K. The third row
reports the eigengene significance |
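The fundamental versions of the network significance and the hub gene significance can be sketched as follows. The sketch assumes that the scaled connectivity K is the connectivity divided by its maximum and that the hub gene significance is the least-squares slope of a regression of GS on K through the origin, that is sum(GS*K)/sum(K^2). The simulated data are hypothetical.

```r
# Hypothetical simulated data: 30 samples (rows), 6 genes (columns).
set.seed(3)
nSamples <- 30
nGenes <- 6
shared <- rnorm(nSamples)
datExpr <- sapply(seq_len(nGenes), function(i) shared + rnorm(nSamples))
trait <- shared + rnorm(nSamples)
power <- 2

A <- abs(cor(datExpr))^power
diag(A) <- 0
GS <- as.vector(abs(cor(datExpr, trait))^power)

k <- rowSums(A)
K <- k / max(k)   # scaled connectivity (assumed definition)

NetworkSignificance <- mean(GS)                 # average gene significance
HubGeneSignificance <- sum(GS * K) / sum(K^2)   # slope of GS on K through the origin

# The same slope obtained from a no-intercept linear model.
slope <- unname(coef(lm(GS ~ 0 + K)))
```

The closed-form expression and the lm() fit agree because both minimize the sum of squared residuals of GS - b*K over the single coefficient b.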
Author(s)
Jun Dong, Steve Horvath, Peter Langfelder
References
Zhang B, Horvath S (2005) A General Framework for Weighted Gene Co-Expression Network Analysis. Statistical Applications in Genetics and Molecular Biology 4(1): Article 17
Dong J, Horvath S (2007) Understanding Network Concepts in Modules. BMC Systems Biology 1:24
Horvath S, Dong J (2008) Geometric Interpretation of Gene Coexpression Network Analysis. PLoS Comput Biol 4(8): e1000117
See Also
conformityBasedNetworkConcepts for approximate conformity-based network concepts
fundamentalNetworkConcepts for calculation of fundamental network concepts only.
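A minimal usage sketch on simulated data is given below. The data are hypothetical, and the call is guarded so that the code also runs when the WGCNA package is not installed. The component names accessed on the result are those listed under Value.

```r
# Hypothetical simulated data: 20 samples (rows), 8 genes (columns).
set.seed(4)
nSamples <- 20
nGenes <- 8
shared <- rnorm(nSamples)
datExpr <- as.data.frame(sapply(seq_len(nGenes), function(i) shared + rnorm(nSamples)))
trait <- shared + rnorm(nSamples)

# Call the function only if the WGCNA package is available.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  nc <- WGCNA::networkConcepts(datExpr, power = 2, trait = trait,
                               networkType = "unsigned")
  print(nc$Summary)             # concepts that depend only on the adjacency matrix
  print(nc$Significance)        # concepts that also depend on the trait
  print(head(nc$Connectivity))  # connectivity of the first few nodes
}
```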