fisher.pvalues.support {DiscreteFDR}    R Documentation

Computing discrete p-values and their support for binomial and Fisher's exact tests

Description

[Deprecated]

Computes discrete raw p-values and their support for the binomial test or Fisher's exact test applied to 2x2 contingency tables summarizing counts from two categorical measurements.

Note: In future versions, this function will be removed. Generation of p-value supports for different exact tests, including Fisher's, will be moved to a separate package.

Usage

fisher.pvalues.support(counts, alternative = "greater", input = "noassoc")

Arguments

counts

a data frame of two or four columns and any number of rows; each row represents a 2x2 contingency table to test. The number of columns and what they must contain depend on the value of the input argument; see Details.

alternative

same argument as in stats::fisher.test. The three possible values are "greater" (default), "two.sided" or "less" and you can specify just the initial letter.

input

the format of the input data frame, see Details. The three possible values are "noassoc" (default), "marginal" or "HG2011" and you can specify just the initial letter.

Details

Assume that each contingency table compares two variables and summarizes the counts of association or non-association with a condition. This can be summarized in the following table:

             Association    No association    Total
Variable 1   X_1            Y_1               N_1
Variable 2   X_2            Y_2               N_2
Total        X_1 + X_2      Y_1 + Y_2         N_1 + N_2

If input="noassoc", counts has four columns which contain X_1, Y_1, X_2 and Y_2, respectively. If input="marginal", counts has four columns which contain X_1, N_1, X_2 and N_2, respectively.
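The two four-column layouts carry the same information, because Y_i = N_i - X_i. As a minimal sketch in base R, the conversion from the "marginal" layout to the "noassoc" layout could look as follows. The helper name marginal_to_noassoc and the toy data are assumptions made for illustration; they are not part of DiscreteFDR.

```r
# Hypothetical helper (not part of DiscreteFDR): convert the "marginal"
# layout (X_1, N_1, X_2, N_2) to the "noassoc" layout (X_1, Y_1, X_2, Y_2).
marginal_to_noassoc <- function(counts) {
  stopifnot(ncol(counts) == 4)
  data.frame(
    X1 = counts[[1]],
    Y1 = counts[[2]] - counts[[1]],  # Y_1 = N_1 - X_1
    X2 = counts[[3]],
    Y2 = counts[[4]] - counts[[3]]   # Y_2 = N_2 - X_2
  )
}

# Toy input in the "marginal" layout (illustrative values only)
marginal <- data.frame(X1 = c(4, 2), N1 = c(148, 148),
                       X2 = c(0, 0), N2 = c(132, 132))
noassoc <- marginal_to_noassoc(marginal)
```

Under these assumptions, both data frames describe the same 2x2 tables, so either layout should lead to the same tests when passed with the matching input value.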

If input="HG2011", we are in the situation of the amnesia data set as in Heller & Gur (2011, see References). Each contingency table is obtained from one variable which is compared to all other variables of the study. That is, the counts for the "second variable" are replaced by the sum of the counts of the other variables:

                   Association            No association         Total
Variable j         X_j                    Y_j                    N_j
Variables \neq j   \sum_{i \neq j} X_i    \sum_{i \neq j} Y_i    \sum_{i \neq j} N_i
Total              \sum X_i               \sum Y_i               \sum N_i

Hence counts needs to have only two columns which respectively contain X_j and Y_j.
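To make the "HG2011" layout concrete, the following sketch expands a two-column data frame into one explicit 2x2 table per row, following the table above. The helper name hg2011_to_noassoc and the toy data are hypothetical; this illustrates the described layout and is not the package's internal code.

```r
# Hypothetical helper (not part of DiscreteFDR): expand the two-column
# "HG2011" layout (X_j, Y_j) into one explicit 2x2 table per row.
hg2011_to_noassoc <- function(counts) {
  stopifnot(ncol(counts) == 2)
  X <- counts[[1]]
  Y <- counts[[2]]
  data.frame(
    X1 = X,
    Y1 = Y,
    X2 = sum(X) - X,  # sum over i != j of X_i
    Y2 = sum(Y) - Y   # sum over i != j of Y_i
  )
}

# Toy input with three variables (illustrative values only)
toy <- data.frame(X = c(1, 2, 3), Y = c(10, 20, 30))
expanded <- hg2011_to_noassoc(toy)
```

Each row of expanded then compares variable j with the pooled counts of all other variables.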

The code for the computation of the p-values of Fisher's exact test is inspired by the example in the help page of p.discrete.adjust of package discreteMTP, which is no longer available on CRAN.

See the Wikipedia article about Fisher's exact test, paragraph Example, for a good depiction of what the code does for each possible value of alternative.
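For a single table, the three alternatives can be sketched with the hypergeometric distribution in base R. With fixed margins, X_1 follows a hypergeometric distribution under the null hypothesis, and the p-value sums the probabilities of the outcomes that are at least as extreme as the observed one. This is the standard textbook computation, written here so that it can be compared with stats::fisher.test; it is not claimed to be the package's own code, and all variable names are chosen for illustration.

```r
# One 2x2 table in the notation of the Details section (illustrative values)
x1 <- 4; y1 <- 144; x2 <- 0; y2 <- 132

m <- x1 + x2  # total count of "association"
n <- y1 + y2  # total count of "no association"
k <- x1 + y1  # N_1, the row total of variable 1

# Under H0 with fixed margins, X_1 is hypergeometric with parameters (m, n, k)
p_less    <- phyper(x1, m, n, k)                          # P(X_1 <= x1)
p_greater <- phyper(x1 - 1, m, n, k, lower.tail = FALSE)  # P(X_1 >= x1)

# Two-sided: sum the probabilities of all outcomes that are no more likely
# than the observed one (small relative tolerance for floating-point ties)
lo <- max(0, k - n)
hi <- min(k, m)
d <- dhyper(lo:hi, m, n, k)
d_obs <- dhyper(x1, m, n, k)
p_two <- sum(d[d <= d_obs * (1 + 1e-7)])

# Same table as a matrix: rows are variables, columns are
# association / no association
tab <- matrix(c(x1, x2, y1, y2), nrow = 2)
```

The three values can be checked against fisher.test(tab, alternative = "less"), fisher.test(tab, alternative = "greater") and fisher.test(tab, alternative = "two.sided").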

Value

A list of two elements:

raw

raw discrete p-values.

support

a list of the supports of the CDFs of the p-values. Each support is represented by a vector in increasing order.
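The notion of a support can be illustrated for one table and alternative = "greater". The support is the set of p-values that are attainable given the table's margins, that is, P(X_1 >= k) for every possible value k of X_1. The following base R sketch assumes this reading of the text above; the values returned by the package may differ in numerical details, and the variable names are illustrative.

```r
# One 2x2 table in the notation of the Details section (illustrative values)
x1 <- 4; y1 <- 144; x2 <- 0; y2 <- 132

m <- x1 + x2  # total count of "association"
n <- y1 + y2  # total count of "no association"
k <- x1 + y1  # N_1, the row total of variable 1

# All values X_1 can take when the margins are fixed
outcomes <- max(0, k - n):min(k, m)

# One attainable p-value per outcome: P(X_1 >= outcome)
attainable <- phyper(outcomes - 1, m, n, k, lower.tail = FALSE)

# Support as a vector in increasing order, without duplicates
support <- sort(unique(attainable))

# The observed raw p-value is one element of the support
raw <- phyper(x1 - 1, m, n, k, lower.tail = FALSE)
```

In this sketch the largest element of support is 1, which corresponds to the smallest possible value of X_1.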

References

R. Heller and H. Gur (2011). False discovery rate controlling procedures for discrete tests. arXiv preprint. arXiv:1112.4627v2.

"Fisher's exact test", Wikipedia, The Free Encyclopedia, accessed 2018-03-20, link.

See Also

fisher.test

Examples

X1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
X2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
Y1 <- N1 - X1
Y2 <- N2 - X2
df <- data.frame(X1, Y1, X2, Y2)
df

# Construction of the p-values and their support
df.formatted <- fisher.pvalues.support(counts = df, input = "noassoc")
raw.pvalues <- df.formatted$raw
pCDFlist <- df.formatted$support

data(amnesia)
# We only keep the first 100 rows to keep the computations fast.
# We also drop the first column to keep only the columns of counts, as in the Heller & Gur (2011) setting.
amnesia <- amnesia[1:100,2:3]

# Construction of the p-values and their support
amnesia.formatted <- fisher.pvalues.support(counts = amnesia, input = "HG2011")
raw.pvalues <- amnesia.formatted$raw
pCDFlist <- amnesia.formatted$support

[Package DiscreteFDR version 1.3.7 Index]