R: FDR Computation

p.fdr {FDRestimation}

R Documentation

FDR Computation

Description

Computes false discovery rates (FDRs) and method-adjusted p-values.

Usage

p.fdr(
  pvalues = NA,
  zvalues = "two.sided",
  threshold = 0.05,
  adjust.method = "BH",
  BY.corr = "positive",
  just.fdr = FALSE,
  default.odds = 1,
  estim.method = "set.pi0",
  set.pi0 = 1,
  hist.breaks = "scott",
  ties.method = "random",
  sort.results = FALSE,
  na.rm = TRUE
)

Arguments

`pvalues`	A numeric vector of raw p-values.
`zvalues`	A numeric vector of z-values to be used in pi0 estimation or a string with options "two.sided", "greater" or "less". Defaults to "two.sided".
`threshold`	A numeric value in the interval `[0,1]` used as the significance level in the multiple-comparison hypothesis tests. Defaults to 0.05.
`adjust.method`	A string identifying the p-value and false discovery rate adjustment method. Defaults to `BH`. Options are `BH`, `BY`, `Bon`, `Holm`, `Hoch`, and `Sidak`.
`BY.corr`	A string of either "positive" or "negative" to determine which correlation is used in the BY method. Defaults to `positive`.
`just.fdr`	A Boolean TRUE or FALSE value. If TRUE, only the FDR vector is returned instead of the list output. Defaults to FALSE.
`default.odds`	A numeric value determining the ratio of pi1/pi0 used in the computation of one FDR. Defaults to 1.
`estim.method`	A string used to determine which method is used to estimate the null proportion or pi0 value. Defaults to `set.pi0`.
`set.pi0`	A numeric value to specify a known or assumed pi0 value in the interval `[0,1]`. Defaults to 1, which means that all inputted raw p-values are assumed to come from the null distribution.
`hist.breaks`	A numeric or string variable representing how many breaks are used in the pi0 estimation histogram methods. Defaults to "scott".
`ties.method`	A character string specifying how ties are treated. Options are "first", "last", "average", "min", "max", or "random". Defaults to "random".
`sort.results`	A Boolean TRUE or FALSE value which sorts the output in either increasing or non-increasing order dependent on the FDR vector. Defaults to FALSE.
`na.rm`	A Boolean TRUE or FALSE value indicating whether NA's should be removed from the inputted raw p-value vector before further computation. Defaults to TRUE.
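
The corrections named by `adjust.method` can be illustrated with base R. The following is a minimal sketch, not the package's implementation: it assumes that `BH`, `BY`, `Bon`, `Holm`, and `Hoch` correspond to the `p.adjust` methods of the same families, and it computes a single-step Sidak correction by hand because `p.adjust` does not offer one. The p-values are made up for illustration, and `p.fdr` may treat ties and pi0 differently.

```r
# Illustrative raw p-values (hypothetical data)
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205)
m <- length(p)

# Assumed mapping from the adjust.method labels to base-R p.adjust methods
adjusted <- cbind(
  raw   = p,
  BH    = p.adjust(p, method = "BH"),
  BY    = p.adjust(p, method = "BY"),
  Bon   = p.adjust(p, method = "bonferroni"),
  Holm  = p.adjust(p, method = "holm"),
  Hoch  = p.adjust(p, method = "hochberg"),
  Sidak = 1 - (1 - p)^m   # single-step Sidak, computed by hand
)

# BH by hand: scale each ordered p-value by m/rank, then take a running
# minimum from the largest p-value down so the result is monotone
ord <- order(p, decreasing = TRUE)
bh.manual <- numeric(m)
bh.manual[ord] <- pmin(1, cummin(p[ord] * m / (m:1)))

print(round(adjusted, 4))
```

Comparing the columns side by side shows how much more conservative the family-wise corrections (`Bon`, `Holm`, `Hoch`, `Sidak`) are than the FDR-based ones on the same input.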

Details

Errors or warnings are raised when `pvalues`, `zvalues`, `threshold`, `set.pi0`, `BY.corr`, or `default.odds` are not supplied in a valid form.
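
As a rough illustration of the kind of input checks involved, the sketch below validates a few arguments before a call. The helper `check.inputs` is hypothetical and not part of FDRestimation; the ranges for `pvalues`, `threshold`, and `set.pi0` follow the argument descriptions above, while the exact conditions and messages used by `p.fdr` are not documented here and may differ.

```r
# Hypothetical pre-flight checks; NOT the package's own validation code
check.inputs <- function(pvalues, threshold = 0.05, set.pi0 = 1,
                         BY.corr = "positive") {
  # TRUE only for a single non-missing number in [0, 1]
  in.unit <- function(x) {
    is.numeric(x) && length(x) == 1 && !is.na(x) && x >= 0 && x <= 1
  }

  problems <- character(0)
  if (!is.numeric(pvalues) || any(pvalues < 0 | pvalues > 1, na.rm = TRUE)) {
    problems <- c(problems, "pvalues must be numeric and lie in [0,1]")
  }
  if (!in.unit(threshold)) {
    problems <- c(problems, "threshold must be a single number in [0,1]")
  }
  if (!in.unit(set.pi0)) {
    problems <- c(problems, "set.pi0 must be a single number in [0,1]")
  }
  if (!(BY.corr %in% c("positive", "negative"))) {
    problems <- c(problems, "BY.corr must be 'positive' or 'negative'")
  }
  problems  # empty character vector when everything looks valid
}

check.inputs(c(0.01, 0.20, 0.75))          # no problems reported
check.inputs(c(0.01, 1.50), threshold = 2) # two problems reported
```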

Value

A list containing the following components:

`fdrs`	A numeric vector of method adjusted FDRs.
`Results Matrix`	A numeric matrix of method adjusted FDRs, method adjusted p-values, and raw p-values.
`Reject Vector`	A vector containing Reject.H0 and/or FTR.H0, based on the threshold value and the hypothesis test on the adjusted p-values.
`pi0`	A numeric value for the pi0 value used in the computations.
`threshold`	A numeric value for the threshold value used in the hypothesis tests.
`Adjustment Method`	A string with the name of the method used in the computation (needed for the plot.fdr function).
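
To make the `Reject Vector` component concrete, here is a small base-R sketch of how such labels could be derived from adjusted p-values and a threshold. It uses `p.adjust` with the BH method as a stand-in for the package's adjustment, and it assumes the comparison is "less than or equal to"; whether `p.fdr` uses a strict or non-strict inequality is not stated above. The p-values are made up for illustration.

```r
# Illustrative raw p-values (hypothetical data) and the default threshold
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205)
threshold <- 0.05

# Stand-in for the method-adjusted p-values (base-R BH adjustment)
adj.p <- p.adjust(p, method = "BH")

# Assumed decision rule: reject when the adjusted p-value is <= threshold
reject <- ifelse(adj.p <= threshold, "Reject.H0", "FTR.H0")

data.frame(raw.p = p, adj.p = round(adj.p, 4), decision = reject)
```

Five of the raw p-values fall below 0.05 here, but after adjustment fewer hypotheses are labeled Reject.H0, which is the effect the adjustment is meant to have.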

References

Romain Francois (2014). bibtex: bibtex parser. R package version 0.4.0.

R Core Team (2016). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, https://www.R-project.org/.

Efron B (2013). Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. Cambridge University Press. ISBN 9780511761362.

Benjamini Y, Hochberg Y (1995). “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289–300.

Shaffer JP (1995). “Multiple Hypothesis Testing.” Annual review of psychology, 46(1), 561–584.

Storey JD, Tibshirani R (2003). “Statistical significance for genomewide studies.” Proceedings of the National Academy of Sciences, 100(16), 9440–9445.

Benjamini Y, Yekutieli D (2001). “The control of the false discovery rate in multiple testing under dependency.” Annals of statistics, 1165–1188.

Meinshausen N, Rice J, others (2006). “Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses.” The Annals of Statistics, 34(1), 373–393.

Jiang H, Doerge RW (2008). “Estimating the proportion of true null hypotheses for multiple comparisons.” Cancer informatics, 6, 117693510800600001.

Nettleton D, Hwang JG, Caldo RA, Wise RP (2006). “Estimating the number of true null hypotheses from a histogram of p values.” Journal of agricultural, biological, and environmental statistics, 11(3), 337.

Pounds S, Morris SW (2003). “Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values.” Bioinformatics, 19(10), 1236–1242.

Holm S (1979). “A simple sequentially rejective multiple test procedure.” Scandinavian journal of statistics, 65–70.

Bonferroni C (1936). “Teoria statistica delle classi e calcolo delle probabilità.” Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 3–62.

Hochberg Y (1988). “A sharper Bonferroni procedure for multiple tests of significance.” Biometrika, 75(4), 800–802.

Šidák Z (1967). “Rectangular confidence regions for the means of multivariate normal distributions.” Journal of the American Statistical Association, 62(318), 626–633.

Murray MH, Blume JD (2020). “False Discovery Rate Computation: Illustrations and Modifications.” arXiv:2010.04680.

Examples


# Example 1
pi0 = 0.8
pi1 = 1-pi0
n = 10000
n.0 = ceiling(n*pi0)
n.1 = n-n.0

sim.data = c(rnorm(n.1,3,1),rnorm(n.0,0,1))
sim.data.p = 2*pnorm(-abs(sim.data))

fdr.output = p.fdr(pvalues=sim.data.p, adjust.method="BH")

fdr.output$fdrs
fdr.output$pi0

# Example 2

sim.data.p = c(runif(800),runif(200, min=0, max=0.01))
fdr.output = p.fdr(pvalues=sim.data.p, adjust.method="Holm", sort.results = TRUE)

fdr.output$`Results Matrix`

[Package FDRestimation version 1.0.1 Index]