fisher.pvalues.support {DiscreteFDR}

R Documentation

Computing Discrete P-Values and Their Supports for Fisher's Exact Test

Description

Computes discrete raw p-values and their support for Fisher's exact test applied to 2x2 contingency tables summarizing counts coming from two categorical measurements.

Note: This function is deprecated and will be removed in a future version. Please use generate.pvalues() with test.fun = DiscreteTests::fisher.test.pv and, optionally, preprocess.fun = DiscreteDatasets::reconstruct_two or preprocess.fun = DiscreteDatasets::reconstruct_four instead. Alternatively, use a pipeline like
data |>
  DiscreteDatasets::reconstruct_*(<args>) |>
  DiscreteTests::fisher.test.pv(<args>)

Usage

fisher.pvalues.support(counts, alternative = "greater", input = "noassoc")

Arguments

`counts`	a data frame of two or four columns and any number of rows; each row represents a 2x2 contingency table to test. The number of columns and what they must contain depend on the value of the `input` argument, see Details.
`alternative`	same argument as in `stats::fisher.test()`. The three possible values are `"greater"` (default), `"two.sided"` or `"less"` and you can specify just the initial letter.
`input`	the format of the input data frame, see Details. The three possible values are `"noassoc"` (default), `"marginal"` or `"HG2011"` and you can specify just the initial letter.

Details

Assume that each contingency table compares two variables and summarizes the counts of association or non-association with a condition. This can be laid out as in the following table:

	Association	No association	Total
Variable 1	`X_1`	`Y_1`	`N_1`
Variable 2	`X_2`	`Y_2`	`N_2`
Total	`X_1 + X_2`	`Y_1 + Y_2`	`N_1 + N_2`

If input="noassoc", counts has four columns which respectively contain X_1, Y_1, X_2 and Y_2. If input="marginal", counts has four columns which respectively contain X_1, N_1, X_2 and N_2.
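
The two four-column layouts carry the same information, since Y_i = N_i - X_i. As a minimal sketch in base R, with made-up data and hypothetical object names (`marginal`, `noassoc`), a data frame in the "marginal" layout can be converted to the "noassoc" layout like this:

```r
# Hypothetical data in the "marginal" layout: X_1, N_1, X_2, N_2
marginal <- data.frame(X1 = c(4, 2), N1 = c(148, 148),
                       X2 = c(0, 1), N2 = c(132, 132))

# Same tables in the "noassoc" layout: X_1, Y_1, X_2, Y_2,
# using Y_i = N_i - X_i from the table above
noassoc <- with(marginal, data.frame(X1 = X1, Y1 = N1 - X1,
                                     X2 = X2, Y2 = N2 - X2))
noassoc
```

Under this assumption, passing `marginal` with input = "marginal" or `noassoc` with input = "noassoc" should describe the same set of 2x2 tables.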

If input="HG2011", the data are structured like the amnesia data set in Heller & Gur (2011, see References). Each contingency table is obtained from one variable which is compared to all other variables of the study. That is, the counts for the "second variable" are replaced by the sum of the counts of the other variables:

	Association	No association	Total
Variable `j`	`X_j`	`Y_j`	`N_j`
Variables `\neq j`	`\sum_{i \neq j} X_i`	`\sum_{i \neq j} Y_i`	`\sum_{i \neq j} N_i`
Total	`\sum X_i`	`\sum Y_i`	`\sum N_i`

Hence counts needs to have only two columns which respectively contain X_j and Y_j.
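
To illustrate the table above, the following base R sketch expands a two-column input into the corresponding four-column tables by subtracting each row from the column totals. The data and the names `hg` and `full` are hypothetical, and this mirrors the table rather than the package's internal code:

```r
# Hypothetical two-column input: X_j and Y_j for each variable j
hg <- data.frame(X = c(4, 2, 2), Y = c(144, 146, 146))

# Row j is compared against all other variables, whose counts are
# the column totals minus the counts of row j
full <- data.frame(X1 = hg$X,
                   Y1 = hg$Y,
                   X2 = sum(hg$X) - hg$X,
                   Y2 = sum(hg$Y) - hg$Y)
full
```

Each row of `full` is then a complete 2x2 table in the X_1, Y_1, X_2, Y_2 layout.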

The code for the computation of the p-values of Fisher's exact test is inspired by the example in the help page of p.discrete.adjust of package discreteMTP, which is no longer available on CRAN.

See the Wikipedia article about Fisher's exact test, paragraph Example, for a good depiction of what the code does for each possible value of alternative.
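
As a rough illustration of the computation for a single table, the sketch below uses the hypergeometric distribution from base R (`dhyper`, `phyper`). The function `fisher_sketch` is hypothetical and is not the package's implementation; it assumes the usual conditioning on the table margins and, for the two-sided case, the same "sum of probabilities not larger than the observed one" rule with a small relative tolerance that `stats::fisher.test()` uses.

```r
# Hypothetical helper: p-value and support for one 2x2 table
fisher_sketch <- function(x1, y1, x2, y2,
                          alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  m <- x1 + x2                    # total "association" counts
  n <- y1 + y2                    # total "no association" counts
  k <- x1 + y1                    # N_1, total of the first variable
  xs <- max(0, k - n):min(k, m)   # attainable values of X_1 given the margins
  probs <- dhyper(xs, m, n, k)    # null probabilities of each attainable value
  pv <- switch(alternative,
    greater   = phyper(xs - 1, m, n, k, lower.tail = FALSE),  # P(X >= x)
    less      = phyper(xs, m, n, k),                          # P(X <= x)
    two.sided = vapply(probs,
                       function(p) sum(probs[probs <= p * (1 + 1e-7)]),
                       numeric(1))
  )
  pv <- pmin(pv, 1)               # guard against rounding slightly above 1
  list(raw = pv[xs == x1], support = sort(unique(pv)))
}

# First table of the Examples section: X1 = 4, Y1 = 144, X2 = 0, Y2 = 132
res <- fisher_sketch(4, 144, 0, 132, alternative = "greater")
ref <- fisher.test(matrix(c(4, 0, 144, 132), 2), alternative = "greater")$p.value
all.equal(res$raw, ref)
res$support
```

The support is simply the set of p-values that would be obtained for every attainable value of X_1, sorted in increasing order.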

Value

A list of two elements:

`raw`	raw discrete p-values.
`support`	a list of the supports of the CDFs of the p-values. Each support is represented by a vector in increasing order.
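
One way to read a support vector: for p-values defined as tail probabilities, the null CDF of the p-value at a threshold t is the largest support point not exceeding t (or 0 if there is none). The sketch below illustrates this with a made-up support vector; `p_cdf` is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: null CDF of a discrete p-value at threshold t,
# given its support in increasing order
p_cdf <- function(t, support) {
  s <- support[support <= t]
  if (length(s) > 0) max(s) else 0
}

support <- c(0.1, 0.4, 1)   # made-up support vector
p_cdf(0.25, support)        # largest support point <= 0.25
p_cdf(0.05, support)        # no support point <= 0.05
```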

References

R. Heller and H. Gur (2011). False discovery rate controlling procedures for discrete tests. arXiv preprint. arXiv:1112.4627v2.

"Fisher's exact test", Wikipedia, The Free Encyclopedia, accessed 2018-03-20.

Examples

X1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
X2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
Y1 <- N1 - X1
Y2 <- N2 - X2
df <- data.frame(X1, Y1, X2, Y2)
df

# Compute p-values and their supports of Fisher's exact test
df.formatted <- fisher.pvalues.support(counts = df, input = "noassoc")
raw.pvalues <- df.formatted$raw
pCDFlist <- df.formatted$support

[Package DiscreteFDR version 2.0.0 Index]