fisher.pvalues.support {DiscreteFDR} | R Documentation |
Computing discrete p-values and their support for binomial and Fisher's exact tests
Description
Computes discrete raw p-values and their support for the binomial test or Fisher's exact test applied to 2x2 contingency tables that summarize counts from two categorical measurements.
Note: In future versions, this function will be removed. Generation of p-value supports for different exact tests, including Fisher's, will be moved to a separate package.
Usage
fisher.pvalues.support(counts, alternative = "greater", input = "noassoc")
Arguments
counts |
a data frame of two or four columns and any number of lines; each line represents a 2x2 contingency table to test. The number of columns and what they must contain depend on the value of the input argument, see Details. |
alternative |
same argument as in stats::fisher.test. The three possible values are "greater" (default), "two.sided" or "less". |
input |
the format of the input data frame, see Details. The three possible values are "noassoc" (default), "marginal" or "HG2011". |
Details
Assume that each contingency table compares two variables and summarizes the counts of association or non-association with a condition. This can be summarized in the following table:
| Association | No association | Total |
Variable 1 | X_1 | Y_1 | N_1 |
Variable 2 | X_2 | Y_2 | N_2 |
Total | X_1 + X_2 | Y_1 + Y_2 | N_1 + N_2 |
If input="noassoc", counts has four columns which respectively contain X_1, Y_1, X_2 and Y_2. If input="marginal", counts has four columns which respectively contain X_1, N_1, X_2 and N_2.
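The two four-column layouts carry the same information, since Y_i = N_i - X_i. As a small sketch (the data frame names marginal and noassoc and the counts are made up for illustration), a table in the "marginal" layout can be rewritten in the "noassoc" layout with base R:

```r
# Hypothetical counts in the "marginal" layout: X_1, N_1, X_2, N_2
marginal <- data.frame(
  X1 = c(4, 2, 2),
  N1 = c(148, 148, 148),
  X2 = c(0, 0, 1),
  N2 = c(132, 132, 132)
)

# Same tables in the "noassoc" layout: X_1, Y_1, X_2, Y_2,
# using Y_i = N_i - X_i
noassoc <- with(marginal, data.frame(
  X1 = X1,
  Y1 = N1 - X1,
  X2 = X2,
  Y2 = N2 - X2
))

noassoc
```

Either data frame describes the same three contingency tables; only the value passed as input would differ.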
If input="HG2011", we are in the situation of the amnesia data set as in Heller & Gur (2011, see References). Each contingency table is obtained from one variable which is compared to all other variables of the study. That is, the counts for the "second variable" are replaced by the sum of the counts of the other variables:
| Association | No association | Total |
Variable j | X_j | Y_j | N_j |
Variables \neq j | \sum_{i \neq j} X_i | \sum_{i \neq j} Y_i | \sum_{i \neq j} N_i |
Total | \sum X_i | \sum Y_i | \sum N_i |
Hence counts needs to have only two columns which respectively contain X_j and Y_j.
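The table above can be spelled out in a few lines of base R. This is only a sketch of the layout it describes, with made-up counts and hypothetical names (X, Y, tables); it is not taken from the package's source code:

```r
# Hypothetical per-variable counts: X_j (association), Y_j (no association)
X <- c(4, 2, 14, 6)
Y <- c(144, 146, 134, 142)

# "Second variable" of table j: sum over all other variables i != j
X_rest <- sum(X) - X
Y_rest <- sum(Y) - Y

# One row per variable j, written in the four-column layout X_1, Y_1, X_2, Y_2
tables <- data.frame(X1 = X, Y1 = Y, X2 = X_rest, Y2 = Y_rest)
tables
```

Every row of tables shares the same column totals, sum(X) and sum(Y), which matches the "Total" line of the table above.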
The code for the computation of the p-values of Fisher's exact test is inspired by the example in the help page of p.discrete.adjust of package discreteMTP, which is no longer available on CRAN.
See the Wikipedia article about Fisher's exact test, paragraph Example, for a good depiction of what the code does for each possible value of alternative.
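As a rough illustration of this kind of computation, the sketch below derives the one-sided p-value and its support for a single 2x2 table with base R's hypergeometric functions. It is not the package's actual code: the variable names and counts are made up, and only the case alternative = "greater" is covered.

```r
# Hypothetical counts for one 2x2 table (not taken from the package internals)
x1 <- 4; y1 <- 144   # Variable 1: association / no association
x2 <- 0; y2 <- 132   # Variable 2: association / no association

m <- x1 + x2         # column total "Association"
n <- y1 + y2         # column total "No association"
k <- x1 + y1         # row total of Variable 1 (N_1)

# With all margins fixed, X_1 can only take these values
x_vals <- max(0, k - n):min(k, m)

# P(X_1 >= x) for every attainable x;
# phyper(x - 1, ..., lower.tail = FALSE) returns P(X_1 > x - 1)
p_greater <- phyper(x_vals - 1, m, n, k, lower.tail = FALSE)

raw <- p_greater[x_vals == x1]       # p-value of the observed table
support <- sort(unique(p_greater))   # all attainable p-values, increasing
```

The value of raw should agree with fisher.test(matrix(c(x1, x2, y1, y2), nrow = 2), alternative = "greater")$p.value, and support lists every p-value that a table with these margins could have produced.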
Value
A list of two elements:
raw |
raw discrete p-values. |
support |
a list of the supports of the CDFs of the p-values. Each support is represented by a vector in increasing order. |
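To make the role of a support vector concrete: for a discrete p-value P whose support points a satisfy P(P <= a) = a under the null hypothesis (as is the case for one-sided Fisher p-values), the CDF at any t is the largest support point not exceeding t, or 0 if there is none. The helper pcdf below is hypothetical and not part of DiscreteFDR, and the support values are invented for the example.

```r
# Hypothetical support of one discrete p-value, in increasing order
support <- c(0.02, 0.15, 0.5, 1)

# Step-function CDF implied by a support vector (illustrative helper)
pcdf <- function(t, support) {
  idx <- findInterval(t, support)  # number of support points <= t
  c(0, support)[idx + 1]           # 0 when t lies below the smallest point
}

pcdf(c(0.01, 0.02, 0.3, 1), support)
# values: 0, 0.02, 0.15, 1
```

Discrete multiple testing procedures rely on such step functions, which is why the supports are returned alongside the raw p-values.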
References
R. Heller and H. Gur (2011). False discovery rate controlling procedures for discrete tests. arXiv preprint. arXiv:1112.4627v2.
"Fisher's exact test", Wikipedia, The Free Encyclopedia, accessed 2018-03-20, link.
See Also
Examples
library(DiscreteFDR)
X1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
X2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
Y1 <- N1 - X1
Y2 <- N2 - X2
df <- data.frame(X1, Y1, X2, Y2)
df
# Construction of the p-values and their support
df.formatted <- fisher.pvalues.support(counts = df, input = "noassoc")
raw.pvalues <- df.formatted$raw
pCDFlist <- df.formatted$support
data(amnesia)
# We only keep the first 100 lines to keep the computations fast.
# We also drop the first column to keep only the columns of counts, as in the Heller & Gur (2011) setting.
amnesia <- amnesia[1:100, 2:3]
# Construction of the p-values and their support
amnesia.formatted <- fisher.pvalues.support(counts = amnesia, input = "HG2011")
raw.pvalues <- amnesia.formatted$raw
pCDFlist <- amnesia.formatted$support