invnorm {metaRNASeq} | R Documentation |
P-value combination using the inverse normal method
Description
Combines one-sided p-values using the inverse normal method.
Usage
invnorm(indpval, nrep, BHth = 0.05)
Arguments
indpval |
List of vectors of one-sided p-values to be combined. |
nrep |
Vector of numbers of replicates used in each study to calculate the previous one-sided p-values. |
BHth |
Benjamini-Hochberg threshold. By default, the false discovery rate is controlled at 5%. |
Details
For each gene g, let
N_g = \sum_{s=1}^S \omega_s \Phi^{-1}(1-p_{gs}),
where p_{gs} corresponds to the raw p-value obtained for gene g in a differential analysis for study s (assumed to be uniformly distributed under the null hypothesis), \Phi is the cumulative distribution function of the standard normal distribution, and \omega_s is a set of weights.
We define the weights \omega_s as in Marot and Mayer (2009):
\omega_s = \sqrt{\frac{\sum_c R_{cs}}{\sum_\ell \sum_c R_{c\ell}}},
where \sum_c R_{cs} is the total number of biological replicates in study s. This allows studies with large numbers of biological replicates to be attributed a larger weight than smaller studies.
Under the null hypothesis, the test statistic N_g follows a N(0,1) distribution. A one-sided test on the right-hand tail of the distribution may then be performed, and classical procedures for the correction of multiple testing, such as that of Benjamini and Hochberg (1995), may subsequently be applied to the obtained p-values to control the false discovery rate at a desired level \alpha.
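The computation described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the p-values are simulated, and the names indpval_demo, nrep_demo, w, z, N, rawp and adjp are assumptions introduced only for this sketch.

```r
# Hypothetical one-sided p-values for 5 genes in 2 studies (simulated, illustrative only)
set.seed(1)
indpval_demo <- list(study1 = runif(5), study2 = runif(5))
nrep_demo <- c(4, 12)  # assumed total number of biological replicates per study

# Weights: square root of each study's share of all replicates
w <- sqrt(nrep_demo / sum(nrep_demo))

# Per-study normal scores Phi^{-1}(1 - p_gs), arranged as a genes x studies matrix
z <- sapply(indpval_demo, function(p) qnorm(1 - p))

# Test statistic N_g: weighted sum of the normal scores across studies
N <- as.vector(z %*% w)

# Right-tail p-value under N(0,1), followed by Benjamini-Hochberg adjustment
rawp <- pnorm(N, lower.tail = FALSE)
adjp <- p.adjust(rawp, method = "BH")
```

Because the squared weights sum to one, N has unit variance under the null hypothesis; here the study with 12 replicates receives weight sqrt(0.75) and the study with 4 replicates receives weight 0.5.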
Value
DEindices |
Indices of differentially expressed genes at the chosen Benjamini-Hochberg threshold. |
TestStatistic |
Vector with test statistics for differential expression in the meta-analysis. |
rawpval |
Vector with raw p-values for differential expression in the meta-analysis. |
adjpval |
Vector with adjusted p-values for differential expression in the meta-analysis. |
Note
This function resembles the function directpvalcombi in the metaMA R package; there is, however, one
important difference in the calculation of p-values. In particular, for microarray data, it is typically
advised to separate under- and over-expressed genes prior to the meta-analysis. In the case of RNA-seq data,
differential analyses from individual studies typically make use of negative binomial models and exact tests,
which lead to one-sided, rather than two-sided, p-values. As such, we suggest performing a meta-analysis over
the full set of genes, followed by an a posteriori check, and if necessary filter, of genes with conflicting
results (over vs. under expression) among studies.
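One possible form of the a posteriori check is sketched below in base R. It assumes that the sign of the estimated log fold change is available for each gene in each study; the objects signs and adjpval_demo are hypothetical stand-ins and are not produced by invnorm in this form.

```r
# Hypothetical signs of log fold changes (genes x studies): +1 = over, -1 = under
signs <- cbind(study1 = c(1,  1, -1, -1, 1),
               study2 = c(1, -1, -1,  1, 1))

# Hypothetical adjusted p-values from the meta-analysis (illustrative only)
adjpval_demo <- c(0.01, 0.02, 0.20, 0.03, 0.04)

# Genes declared differentially expressed at the 5% threshold
DEindices_demo <- which(adjpval_demo <= 0.05)

# Genes whose direction of expression differs between studies
conflict <- which(abs(rowSums(signs)) != ncol(signs))

# Keep only differentially expressed genes with concordant direction
keep <- setdiff(DEindices_demo, conflict)
```

In this toy input, genes 2 and 4 are significant but show opposite directions in the two studies, so only genes 1 and 5 are retained.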
References
Y. Benjamini and Y. Hochberg (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSS B (57): 289-300.
Hedges, L. and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Academic Press.
Marot, G. and Mayer, C.-D. (2009). Sequential analysis for microarray data based on sensitivity and meta-analysis. SAGMB 8(1): 1-33.
A. Rau, G. Marot and F. Jaffrezic (2014). Differential meta-analysis of RNA-seq data. BMC Bioinformatics 15:91.
See Also
Examples
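data(rawpval)
## 8 replicates simulated in each study
invnormcomb <- invnorm(rawpval, nrep = c(8, 8), BHth = 0.05)
## Indicator (1/0) of differential expression at the 5% threshold
DE <- ifelse(invnormcomb$adjpval <= 0.05, 1, 0)
## Histogram of the combined raw p-values
hist(invnormcomb$rawpval, nclass = 100)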
## A more detailed example is given in the vignette of the package:
## vignette("metaRNASeq")