fisher_test_pv {DiscreteTests} | R Documentation |
Fisher's Exact Test for Count Data
Description
fisher_test_pv() performs Fisher's exact test or a chi-square approximation
to assess if rows and columns of a 2-by-2 contingency table with fixed
marginals are independent. In contrast to stats::fisher.test(), it is
vectorised, only calculates p-values and offers a normal approximation of
their computation. Furthermore, it is capable of returning the discrete
p-value supports, i.e. all observable p-values under a null hypothesis.
Multiple tables can be analysed simultaneously. In two-sided tests, several
procedures of obtaining the respective p-values are implemented.
Note: Please use fisher_test_pv()! The older fisher.test.pv() is
deprecated in order to migrate to snake case. It will be removed in a future
version.
Usage
fisher_test_pv(
x,
alternative = "two.sided",
ts_method = "minlike",
exact = TRUE,
correct = TRUE,
simple_output = FALSE
)
fisher.test.pv(
x,
alternative = "two.sided",
ts.method = "minlike",
exact = TRUE,
correct = TRUE,
simple.output = FALSE
)
Arguments
x |
integer vector with four elements, a 2-by-2 matrix or an integer matrix (or data frame) with four columns, where each line represents a 2-by-2 table to be tested. |
alternative |
character vector that indicates the alternative hypotheses; each value must be one of |
ts_method , ts.method |
single character string that indicates the two-sided p-value computation method (if any value in |
exact |
logical value that indicates whether p-values are to be calculated by exact computation ( |
correct |
logical value that indicates if a continuity correction is to be applied ( |
simple_output , simple.output |
logical value that indicates whether only a vector of p-values is to be returned (TRUE) or an R6 class object that also includes the tests' parameters and support sets, i.e. all observable p-values under each null hypothesis (FALSE; see below). |
Details
The parameters x and alternative are vectorised. They are replicated
automatically, such that the number of rows of x is the same as the length
of alternative. This allows multiple null hypotheses to be tested
simultaneously. Since x is (if necessary) coerced to a matrix with four
columns, it is replicated row-wise.
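The replication described above can be sketched as follows. This is an illustration under assumptions rather than reference usage: it assumes the DiscreteTests package is installed, and that "less" and "greater" are accepted values of alternative (as in stats::fisher.test()), which this page does not list explicitly. The counts are taken from the first rows of the Examples data.

```r
# Sketch only: assumes DiscreteTests is installed and that "less" and
# "greater" are valid values of `alternative` (an assumption, see above).
if (requireNamespace("DiscreteTests", quietly = TRUE)) {
  library(DiscreteTests)

  # A single 2-by-2 table, given as a vector of four counts
  x <- c(4L, 144L, 0L, 132L)

  # Three alternatives: x is replicated to three rows, one test per alternative
  p_three <- fisher_test_pv(
    x,
    alternative = c("less", "greater", "two.sided"),
    simple_output = TRUE
  )

  # Three tables (one per row) with a single alternative, which is recycled
  tabs <- rbind(
    c(4L, 144L, 0L, 132L),
    c(2L, 146L, 0L, 132L),
    c(14L, 134L, 3L, 129L)
  )
  p_tabs <- fisher_test_pv(tabs, alternative = "two.sided", simple_output = TRUE)
}
```

In both calls, one p-value per (table, alternative) pair is expected.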
If exact = TRUE, Fisher's exact test is performed (the specific hypothesis
depends on the value of alternative). Otherwise, if exact = FALSE, a
chi-square approximation is used for two-sided hypotheses or a normal
approximation for one-sided tests, based on the square root of the
chi-squared statistic. This is possible because the degrees of freedom of
chi-squared tests on 2-by-2 tables are limited to 1.
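The relation between the chi-squared and normal approximations can be checked in base R. The sketch below uses made-up counts and no continuity correction; it is not the package's internal code, and the sign convention for the one-sided direction is an assumption made for illustration.

```r
# Illustrative 2-by-2 table (made-up counts), filled column-wise
tab <- matrix(c(12, 5, 7, 14), nrow = 2)

# Pearson chi-squared statistic without continuity correction (1 df)
stat <- unname(chisq.test(tab, correct = FALSE)$statistic)

# Two-sided p-value from the chi-squared distribution with 1 df
p_chisq <- pchisq(stat, df = 1, lower.tail = FALSE)

# With 1 df, the signed square root of the statistic is approximately
# standard normal under the null. The sign follows the deviation of the
# top-left cell from its expected count (a convention assumed here).
a <- tab[1, 1]
expected_a <- sum(tab[1, ]) * sum(tab[, 1]) / sum(tab)
z <- sign(a - expected_a) * sqrt(stat)

# One-sided normal p-values
p_upper <- pnorm(z, lower.tail = FALSE)
p_lower <- pnorm(z)

# Doubling the smaller one-sided p-value recovers the chi-squared p-value
p_doubled <- 2 * min(p_lower, p_upper)
```

Here p_chisq and p_doubled agree up to floating-point error, because a squared standard normal variable follows a chi-squared distribution with one degree of freedom.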
For exact computation, various procedures of determining two-sided p-values are implemented.
"minlike"
The standard approach in stats::fisher.test() and stats::binom.test(). The probabilities of all outcomes whose likelihood is less than or equal to that of the observed one are summed up. In Hirji (2006), it is referred to as the Probability-based approach.
"blaker"
The minima of the observations' lower and upper tail probabilities are combined with the opposite tail not greater than these minima. More details can be found in Blaker (2000) or Hirji (2006), where it is referred to as the Combined Tails method.
"absdist"
The probabilities of the absolute distances from the expected value that are greater than or equal to the observed one are summed up. In Hirji (2006), it is referred to as the Distance from Center approach.
"central"
Simply doubles the minimum of the observations' lower and upper tail probabilities. In Hirji (2006), it is referred to as the Twice the Smaller Tail method.
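The four procedures can be illustrated for a single table with base R's hypergeometric functions. This is a minimal sketch following the verbal descriptions above, not the package's implementation; the counts are made up, and the relative tolerance tol mirrors the one used by stats::fisher.test() to guard against floating-point ties.

```r
# Illustrative 2-by-2 table (made-up counts), filled column-wise
tab <- matrix(c(8, 2, 4, 11), nrow = 2)

# With fixed margins, the top-left cell is hypergeometric under the null
m <- sum(tab[, 1])    # first column total
n <- sum(tab[, 2])    # second column total
k <- sum(tab[1, ])    # first row total
obs <- tab[1, 1]

support <- max(0, k - n):min(k, m)
probs <- dhyper(support, m, n, k)
p_obs <- dhyper(obs, m, n, k)
lower <- phyper(obs, m, n, k)                           # P(X <= obs)
upper <- phyper(obs - 1, m, n, k, lower.tail = FALSE)   # P(X >= obs)
tol <- 1 + 1e-7

# "minlike": sum probabilities not larger than the observed one
p_minlike <- sum(probs[probs <= p_obs * tol])

# "central": twice the smaller tail, capped at 1
p_central <- min(1, 2 * min(lower, upper))

# "absdist": sum probabilities of outcomes at least as far from the mean
expected <- k * m / (m + n)
p_absdist <- sum(probs[abs(support - expected) >= abs(obs - expected)])

# "blaker": smaller tail plus the largest opposite tail not exceeding it
cdf <- phyper(support, m, n, k)                         # P(X <= s)
sf <- phyper(support - 1, m, n, k, lower.tail = FALSE)  # P(X >= s)
if (lower <= upper) {
  opposite <- sf[sf <= lower * tol]
} else {
  opposite <- cdf[cdf <= upper * tol]
}
p_blaker <- min(lower, upper) + if (length(opposite) > 0) max(opposite) else 0
p_blaker <- min(1, p_blaker)
```

As a sanity check, p_minlike should agree with fisher.test(tab)$p.value, and p_blaker can never exceed p_central because the added opposite tail is at most as large as the smaller tail.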
For non-exact (i.e. continuous approximation) approaches, ts_method is
ignored, since all its methods would yield the same p-values. More
specifically, they all converge to the doubling approach as in
ts_method = "central".
Value
If simple_output = TRUE, a vector of computed p-values is returned.
Otherwise, the output is a DiscreteTestResults R6 class object, which
also includes the p-value supports and testing parameters. These have to be
accessed by public methods, e.g. $get_pvalues().
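The two return modes can be compared side by side. The sketch below assumes the DiscreteTests package is installed; it only uses the arguments and accessor methods that appear on this page, with counts taken from the first row of the Examples data.

```r
# Sketch only: assumes the DiscreteTests package is installed
if (requireNamespace("DiscreteTests", quietly = TRUE)) {
  library(DiscreteTests)

  # A single 2-by-2 table, given as a vector of four counts
  x <- c(4L, 144L, 0L, 132L)

  # Plain vector of p-values
  p_vec <- fisher_test_pv(x, simple_output = TRUE)

  # R6 results object; p-values and supports are read via public methods
  res <- fisher_test_pv(x)
  p_obj <- res$get_pvalues()
  supports <- res$get_pvalue_supports()
}
```

Both modes should report the same p-values; the R6 object additionally carries the supports, which procedures for discrete multiple testing typically require.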
References
Fisher, R. A. (1935). The logic of inductive inference. Journal of the Royal Statistical Society Series A, 98, pp. 39–54. doi:10.2307/2342435
Agresti, A. (2002). Categorical data analysis (2nd ed.). New York: John Wiley & Sons. pp. 91–97. doi:10.1002/0471249688
Blaker, H. (2000). Confidence curves and improved exact confidence intervals for discrete distributions. Canadian Journal of Statistics, 28(4), pp. 783–798. doi:10.2307/3315916
Hirji, K. F. (2006). Exact analysis of discrete data. New York: Chapman and Hall/CRC. pp. 55–83. doi:10.1201/9781420036190
See Also
Examples
# Constructing a data frame in which each row is one 2-by-2 table
S1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
S2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
F1 <- N1 - S1
F2 <- N2 - S2
df <- data.frame(S1, F1, S2, F2)
# Computation of Fisher's exact p-values (default: "minlike") and their supports
results_f <- fisher_test_pv(df)
raw_pvalues <- results_f$get_pvalues()
pCDFlist <- results_f$get_pvalue_supports()
# Computation of p-values of chi-square tests and their supports
results_c <- fisher_test_pv(df, exact = FALSE)
raw_pvalues <- results_c$get_pvalues()
pCDFlist <- results_c$get_pvalue_supports()