PlotEnrichment {myTAI} | R Documentation
Plot the Phylostratum or Divergence Stratum Enrichment of a given Gene Set
Description
This function computes and visualizes the significance of enriched (over- or underrepresented) Phylostrata or Divergence Strata within an input test.set.
Usage
PlotEnrichment(
ExpressionSet,
test.set,
use.only.map = FALSE,
measure = "log-foldchange",
complete.bg = TRUE,
legendName = "",
over.col = "steelblue",
under.col = "midnightblue",
epsilon = 1e-05,
cex.legend = 1,
cex.asterisk = 1,
plot.bars = TRUE,
p.adjust.method = NULL,
...
)
Arguments
ExpressionSet |
a standard PhyloExpressionSet or DivergenceExpressionSet object (in case use.only.map = FALSE). |
test.set |
a character vector storing the gene ids for which PS/DS enrichment analyses should be performed. |
use.only.map |
a logical value indicating whether, instead of a standard ExpressionSet, only a Phylostratigraphic Map or Divergence Map is passed to this function. |
measure |
a character string specifying the measure that should be used to quantify over- and underrepresentation of PS/DS. Measures can either be measure = "foldchange" (odds) or measure = "log-foldchange" (log-odds). |
complete.bg |
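a logical value indicating whether the entire background set of the input ExpressionSet should be considered when performing Fisher's exact test (complete.bg = TRUE), or whether the test.set genes should be excluded from the background set before Fisher's exact test is performed (complete.bg = FALSE). |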
legendName |
a character string specifying whether "PS" or "DS" are used to compute relative expression profiles. |
over.col |
color of the overrepresentation bars. |
under.col |
color of the underrepresentation bars. |
epsilon |
a small value by which values are shifted to avoid log(0) computations. |
cex.legend |
the cex (character expansion) value for the legend. |
cex.asterisk |
the cex (character expansion) value for the asterisk. |
plot.bars |
a logical value specifying whether or not bars should be visualized or whether only p.values and enrichment.matrix should be returned. |
p.adjust.method |
correction method to adjust p-values for multiple comparisons (see p.adjust for possible methods). |
... |
default graphics parameters. |
Details
This Phylostratum or Divergence Stratum enrichment analysis is motivated by Sestak and Domazet-Loso (2015), who perform Phylostratum or Divergence Stratum enrichment analyses to correlate organ evolution with the origin of organ-specific genes.
In detail, this function takes the Phylostratum or Divergence Stratum distribution of all genes stored in the input ExpressionSet as the background set and the Phylostratum or Divergence Stratum distribution of the test.set, and performs a fisher.test for each Phylostratum or Divergence Stratum to quantify the statistical significance of over- or underrepresented Phylostrata or Divergence Strata within the test.set genes.
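The per-stratum test described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by PlotEnrichment: the toy data, the variable names (background_ps, test_ids, test_ps) and the exact layout of the 2 x 2 table are hypothetical, and the test genes are left in the background set (in the spirit of complete.bg = TRUE).

```r
# Hypothetical toy data: a Phylostratum (PS) assignment for 1000 background genes
set.seed(42)
background_ps <- sample(1:5, size = 1000, replace = TRUE)
names(background_ps) <- paste0("gene", seq_along(background_ps))

# Hypothetical test set: 200 gene ids drawn from the background
test_ids <- sample(names(background_ps), size = 200)
test_ps  <- background_ps[test_ids]

# For each PS, build a 2 x 2 table (group x membership in this PS)
# and run a two-sided Fisher's exact test (assumed table layout)
strata <- sort(unique(background_ps))
p_values <- vapply(strata, function(ps) {
  tab <- matrix(
    c(sum(test_ps == ps),       sum(test_ps != ps),
      sum(background_ps == ps), sum(background_ps != ps)),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("test.set", "background"), c("in_PS", "not_in_PS"))
  )
  fisher.test(tab)$p.value
}, numeric(1))
names(p_values) <- paste0("PS", strata)

print(p_values)
```

Because the toy test set is a random sample of the background, no stratum is expected to be strongly enriched here; a real test.set would come from a biological selection of genes.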
To visualize the odds or log-odds of over- or underrepresented genes within the test.set, the following procedure is performed:
N_ij denotes the number of genes in group j that derive from PS i, with i = 1, ..., n, where j = 1 denotes the background set and j = 2 denotes the test.set
N_i. denotes the total number of genes within PS i
N_.j denotes the total number of genes within group j
N_.. is the total number of genes within all groups j and all PS i
f_ij = N_ij / N_.. and g_ij = f_ij / f_.j denote relative frequencies between groups, where f_.j is the sum of f_ij over all PS i
f_i. denotes the between-group sum of f_ij, i.e. the sum of f_ij over both groups j
The result is the fold-change value (odds) C = g_i2 / f_i. for each PS i, which is visualized above and below zero.
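The definitions above can be traced with a small worked example in base R. This is a sketch under stated assumptions: the count matrix N is made up, and both the base-2 logarithm and the point at which epsilon is added are illustrative guesses rather than the documented behaviour of PlotEnrichment.

```r
# Hypothetical counts N_ij: rows are PS i = 1..4, columns are groups j
# (column 1 = background set, column 2 = test.set)
N <- matrix(c(400, 40,
              300, 20,
              200, 30,
              100, 10),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("PS", 1:4), c("background", "test.set")))

N_total <- sum(N)                  # N_..
f       <- N / N_total             # f_ij = N_ij / N_..
f_col   <- colSums(f)              # f_.j (sum over all PS i)
g       <- sweep(f, 2, f_col, "/") # g_ij = f_ij / f_.j
f_row   <- rowSums(f)              # f_i. (sum over both groups j)

# Fold-change (odds) per PS: C = g_i2 / f_i.
fold_change <- g[, "test.set"] / f_row

# Assumed form of the log-foldchange: log2 with a small shift to avoid log(0)
epsilon <- 1e-05
log_fold_change <- log2(fold_change + epsilon)

print(round(cbind(fold_change, log_fold_change), 4))
```

In this toy example PS3 holds 30% of the test.set genes but only about 21% of all genes, so its fold-change is above 1 (positive log-foldchange), whereas PS2 is underrepresented.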
In case a large number of Phylostrata or Divergence Strata is included in the input ExpressionSet, p-values returned by PlotEnrichment should be adjusted for multiple comparisons, which can be done by specifying the p.adjust.method argument.
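To illustrate what such an adjustment does, the following sketch applies p.adjust from the stats package to a made-up vector of per-stratum p-values. The values in raw_p are hypothetical and are not output of PlotEnrichment; it is assumed here that p.adjust.method takes one of the method names accepted by p.adjust.

```r
# Hypothetical raw p-values, one per Phylostratum
raw_p <- c(PS1 = 0.001, PS2 = 0.020, PS3 = 0.030, PS4 = 0.400, PS5 = 0.800)

# Benjamini-Hochberg and Bonferroni adjustments from the stats package
adj_bh   <- p.adjust(raw_p, method = "BH")
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

print(rbind(raw = raw_p, BH = adj_bh, bonferroni = adj_bonf))
```

With these toy values, PS2 and PS3 stay at or below 0.05 after the Benjamini-Hochberg adjustment but exceed it after the more conservative Bonferroni correction.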
Author(s)
Hajk-Georg Drost
References
Sestak and Domazet-Loso (2015). Phylostratigraphic Profiles in Zebrafish Uncover Chordate Origins of the Vertebrate Brain. Mol. Biol. Evol. 32(2): 299-312.
See Also
Examples
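library(myTAI)
# load the example PhyloExpressionSet shipped with the package
data(PhyloExpressionSetExample)
# fix the random seed so that the sampled test set is reproducible
set.seed(123)
# draw 10000 random gene ids (column 2 stores the gene ids)
test_set <- sample(PhyloExpressionSetExample[ , 2], 10000)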
## Examples with complete.bg = TRUE
## Hence: the entire background set of the input ExpressionSet is considered
## when performing Fisher's exact test
# measure: log-foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set,
               legendName    = "PS",
               measure       = "log-foldchange")
# measure: foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set,
               legendName    = "PS",
               measure       = "foldchange")
## Examples with complete.bg = FALSE
## Hence: the test.set genes are excluded from the background set before
## Fisher's exact test is performed
# measure: log-foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set,
               complete.bg   = FALSE,
               legendName    = "PS",
               measure       = "log-foldchange")
# measure: foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set,
               complete.bg   = FALSE,
               legendName    = "PS",
               measure       = "foldchange")