TOMsimilarityFromExpr {WGCNA} | R Documentation |
Topological overlap matrix
Description
Calculation of the topological overlap matrix from given expression data.
Usage
TOMsimilarityFromExpr(
datExpr,
weights = NULL,
corType = "pearson",
networkType = "unsigned",
power = 6,
TOMType = "signed",
TOMDenom = "min",
maxPOutliers = 1,
quickCor = 0,
pearsonFallback = "individual",
cosineCorrelation = FALSE,
replaceMissingAdjacencies = FALSE,
suppressTOMForZeroAdjacencies = FALSE,
suppressNegativeTOM = FALSE,
useInternalMatrixAlgebra = FALSE,
nThreads = 0,
verbose = 1, indent = 0)
Arguments
datExpr |
expression data. A data frame in which columns are genes and rows are samples. NAs are allowed, but not too many. |
weights |
optional observation weights for |
corType |
character string specifying the correlation to be used. Allowed values are (unique
abbreviations of) |
networkType |
network type. Allowed values are (unique abbreviations of) |
power |
soft-thresholding power for network construction. |
TOMType |
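one of the topological overlap variants, such as "unsigned", "signed", "signed Nowick", "unsigned 2", "signed 2" and "signed Nowick 2". See details. |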
TOMDenom |
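a character string specifying the TOM variant to be used. Recognized values are
"min" and "mean", which select the function f used in the denominator of the topological overlap. See details. |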
maxPOutliers |
only used for |
quickCor |
real number between 0 and 1 that controls the handling of missing data in the calculation of correlations. See details. |
pearsonFallback |
Specifies whether the bicor calculation, if used, should revert to Pearson when the median
absolute deviation (mad) is zero. Recognized values are (abbreviations of)
|
cosineCorrelation |
logical: should the cosine version of the correlation calculation be used? The cosine calculation differs from the standard one in that it does not subtract the mean. |
replaceMissingAdjacencies |
logical: should missing values in the calculation of adjacency be replaced by 0? |
suppressTOMForZeroAdjacencies |
Logical: should the result be set to zero for zero adjacencies? |
suppressNegativeTOM |
Logical: should the result be set to zero when negative? |
useInternalMatrixAlgebra |
Logical: should WGCNA's own, slow, matrix multiplication be used instead of R-wide BLAS? Only useful for debugging. |
nThreads |
non-negative integer specifying the number of parallel threads to be used by certain parts of correlation calculations. This option only has an effect on systems on which a POSIX thread library is available (which currently includes Linux and Mac OSX, but excludes Windows). If zero, the number of online processors will be used if it can be determined dynamically, otherwise correlation calculations will use 2 threads. |
verbose |
integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose. |
indent |
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces. |
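The cosineCorrelation argument above can be illustrated with a small base R sketch. It is not WGCNA's implementation: cosine_cor is a hypothetical helper, the data are simulated, and missing values are assumed to be absent. The sketch shows only that the cosine version skips the mean subtraction, so it coincides with the Pearson correlation once the columns have been centered.

```r
# Hypothetical helper (not part of WGCNA): cosine correlation between the
# columns of a numeric matrix. Assumes there are no missing values.
cosine_cor <- function(x) {
  norms <- sqrt(colSums(x^2))          # Euclidean norm of each column
  crossprod(x) / outer(norms, norms)   # x'x scaled by the column norms
}

set.seed(1)
# 20 samples (rows), 3 genes (columns), deliberately not centered at zero
x <- matrix(rnorm(20 * 3, mean = 2), nrow = 20)

pearson <- cor(x)         # standard Pearson correlation (subtracts the mean)
cosine  <- cosine_cor(x)  # cosine version (no mean subtraction)

# After centering the columns, the cosine version reproduces Pearson
centered <- scale(x, center = TRUE, scale = FALSE)
same_after_centering <- all.equal(cosine_cor(centered), pearson,
                                  check.attributes = FALSE)
```

On uncentered data such as x, the two matrices generally differ, which is why the choice matters when column means are far from zero.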
Details
Several alternate definitions of topological overlap are available. The oldest version is now called "unsigned"; in this
version, all adjacencies are assumed to be non-negative and the topological overlap of nodes i,j
is given by
TOM_{ij} = \frac{a_{ij} + \sum_{k\neq i,j} a_{ik}a_{kj} }{f(k_i, k_j) + 1 - a_{ij}} \, ,
where the sum is over k not equal to either i or j, the function f in the denominator can be
either min or mean (governed by argument TOMDenom), and k_i = \sum_{j\neq i} a_{ij} is the connectivity of node i.
The signed versions assume that the adjacency matrix was obtained from an underlying correlation matrix, and that the
element a_{ij} carries the sign of the underlying correlation of the two vectors. (Within WGCNA, this can really only
apply to the unsigned adjacency, since signed adjacencies are essentially zero when the underlying correlation is
negative.) The signed and signed Nowick versions are similar to the unsigned version above, differing only in the
absolute values placed in the expression: the signed Nowick expression is
TOM_{ij} = \frac{a_{ij} + \sum_{k\neq i,j} a_{ik}a_{kj} }{f(k_i, k_j) + 1 - |a_{ij}|} \, .
This TOM lies between -1 and 1, and typically is negative when the underlying adjacency is negative. The signed TOM is simply the absolute value of the signed Nowick TOM and is hence always non-negative. For non-negative adjacencies, all 3 versions give the same result.
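The three formulas above can be sketched in base R as follows. This is an illustration under stated assumptions, not the compiled code WGCNA uses: tom_sketch is a hypothetical helper, the diagonal of the result is set to 1 by assumed convention, and for the signed variants the connectivity k_i is assumed to be the sum of absolute adjacencies (the text only shows the absolute value in the denominator; this assumption is what keeps the result between -1 and 1).

```r
# Hypothetical helper (not part of WGCNA): TOM from a symmetric adjacency
# matrix, following the "unsigned", "signed Nowick" and "signed" formulas.
tom_sketch <- function(adj,
                       type = c("unsigned", "signed Nowick", "signed"),
                       denom = c("min", "mean")) {
  type <- match.arg(type)
  denom <- match.arg(denom)
  a <- adj
  diag(a) <- 0  # ignore self-connections in every sum
  # With a zero diagonal, (a %*% a)[i, j] is the sum over k != i, j of a_ik * a_kj
  shared <- a %*% a
  # ASSUMPTION: signed variants use absolute adjacencies for the connectivity
  abs_a <- if (type == "unsigned") a else abs(a)
  k <- rowSums(abs_a)
  # f(k_i, k_j): either the minimum or the mean of the two connectivities
  f <- if (denom == "min") outer(k, k, pmin) else outer(k, k, "+") / 2
  tom <- (a + shared) / (f + 1 - abs_a)
  if (type == "signed") tom <- abs(tom)  # signed TOM = |signed Nowick TOM|
  diag(tom) <- 1  # ASSUMPTION: diagonal fixed at 1 by convention
  tom
}

set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20)  # 20 samples, 6 genes (simulated)
r <- cor(x)
signed_adj <- r * abs(r)^5             # sign(r) * |r|^6: adjacency keeping the sign
nonneg_adj <- abs(signed_adj)          # the corresponding non-negative adjacency

tom_nowick <- tom_sketch(signed_adj, "signed Nowick")
tom_signed <- tom_sketch(signed_adj, "signed")
```

Calling tom_sketch(nonneg_adj, ...) with each of the three types should return the same matrix, in line with the statement that the versions agree for non-negative adjacencies.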
A brief note on terminology: the original article by Nowick et al. uses the name "weighted TO" or wTO; since all of the topological overlap versions calculated in this function are weighted, we use the name signed to indicate that this TOM keeps track of the sign of the underlying correlation.
The "2" versions of all 3 TOM types have a somewhat different form in which the adjacency and the product are normalized separately. Thus, the "unsigned 2" version is
TOM^{(2)}_{ij} = \frac{1}{2}\left[a_{ij} + \frac{\sum_{k\neq i,j} a_{ik}a_{kj} }{f(k_i, k_j) - a_{ij}}\right] \, .
At present the relative weights of the adjacency and the normalized product term are equal and fixed; in the future a user-specified or automatically determined weight may be implemented. The "signed Nowick 2" and "signed 2" versions are defined analogously to their original versions. The adjacency is assumed to be signed, and the expression for "signed Nowick 2" TOM is
TOM^{(2)}_{ij} = \frac{1}{2}\left[a_{ij} + \frac{\sum_{k\neq i,j} a_{ik}a_{kj} }{f(k_i, k_j) - |a_{ij}| } \right] \, .
Analogously to "signed" TOM, "signed 2" differs from "signed Nowick 2" TOM only in taking the absolute value of the result.
At present the "2" versions should all be considered experimental and are subject to change.
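A minimal base R sketch of the "2" formulas, under the same caveats as any illustration here: tom2_sketch is a hypothetical helper rather than WGCNA code, the connectivity of the signed variants is assumed to use absolute adjacencies, the diagonal is set to 1 by assumed convention, and no guard is included for a zero denominator (which occurs when a node has no neighbors other than j).

```r
# Hypothetical helper (not part of WGCNA): the "2" TOM variants, in which the
# adjacency and the shared-neighbor term are normalized separately.
tom2_sketch <- function(adj,
                        type = c("unsigned 2", "signed Nowick 2", "signed 2"),
                        denom = c("min", "mean")) {
  type <- match.arg(type)
  denom <- match.arg(denom)
  a <- adj
  diag(a) <- 0               # ignore self-connections in every sum
  shared <- a %*% a          # sum over k != i, j of a_ik * a_kj
  # ASSUMPTION: signed variants use absolute adjacencies for the connectivity
  abs_a <- if (type == "unsigned 2") a else abs(a)
  k <- rowSums(abs_a)
  f <- if (denom == "min") outer(k, k, pmin) else outer(k, k, "+") / 2
  # Equal, fixed weights (1/2 each) for the adjacency and the normalized product
  tom <- 0.5 * (a + shared / (f - abs_a))
  if (type == "signed 2") tom <- abs(tom)
  diag(tom) <- 1             # ASSUMPTION: diagonal fixed at 1 by convention
  tom
}

set.seed(2)
x <- matrix(rnorm(20 * 6), nrow = 20)  # 20 samples, 6 genes (simulated)
r <- cor(x)
signed_adj <- r * abs(r)^5             # sign(r) * |r|^6
nonneg_adj <- abs(signed_adj)

tom2_nowick <- tom2_sketch(signed_adj, "signed Nowick 2")
tom2_signed <- tom2_sketch(signed_adj, "signed 2")
tom2_unsigned <- tom2_sketch(nonneg_adj, "unsigned 2")
```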
Value
A matrix holding the topological overlap.
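As a hedged illustration of the call and its return value, the sketch below runs the function on simulated data using only arguments listed under Usage. The data, the variable names and the follow-up clustering step are assumptions made for the example; the call is guarded so that the code still runs when WGCNA is not installed.

```r
# Simulated expression data: 20 samples (rows) by 50 genes (columns)
set.seed(42)
datExpr <- as.data.frame(
  matrix(rnorm(20 * 50), nrow = 20,
         dimnames = list(paste0("sample", 1:20), paste0("gene", 1:50)))
)

if (requireNamespace("WGCNA", quietly = TRUE)) {
  TOM <- WGCNA::TOMsimilarityFromExpr(
    datExpr,
    networkType = "unsigned",
    power = 6,
    TOMType = "unsigned",
    TOMDenom = "min",
    verbose = 0
  )
  print(dim(TOM))  # one row and one column per gene

  # A common follow-up (an assumption of this example, not part of the
  # function): cluster genes on the TOM-based dissimilarity
  dissTOM <- 1 - TOM
  geneTree <- hclust(as.dist(dissTOM), method = "average")
}
```

The returned matrix has one row and one column per column of datExpr, so memory use grows with the square of the number of genes.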
Author(s)
Peter Langfelder
References
Bin Zhang and Steve Horvath (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology: Vol. 4: No. 1, Article 17