binom.pval {corpora} | R Documentation |
P-values of the binomial test for frequency counts (corpora)
Description
This function computes the p-value of a binomial test for frequency
counts. In the two-sided case, a “central” p-value (Fay 2010)
provides better numerical efficiency than the likelihood-based approach
of binom.test and is always consistent with confidence intervals.
Usage
binom.pval(k, n, p = 0.5,
alternative = c("two.sided", "less", "greater"))
Arguments
k |
frequency of a type in the corpus (or an integer vector of frequencies) |
n |
number of tokens in the corpus, i.e. sample size (or an integer vector specifying the sizes of different samples) |
p |
null hypothesis, giving the assumed proportion of this type in the population (or a vector of proportions for different types and/or different populations) |
alternative |
a character string specifying the alternative hypothesis; must be one of "two.sided" (default), "less" or "greater" |
Details
For alternative="two.sided" (the default), a “central” p-value
is computed (Fay 2010: 53f), which differs from the likelihood-based two-sided
p-value determined by binom.test (the “minlike” method in Fay's
terminology). This approach has two advantages: (i) it is numerically robust
and efficient, even for very large samples and frequency counts; (ii) it is
always consistent with Clopper-Pearson confidence intervals (see examples below).
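The central p-value described above can be sketched in base R as twice the smaller of the two one-sided tail probabilities, capped at 1. The helper name central.pval is hypothetical and chosen for illustration only; this is a minimal sketch of the definition and not necessarily how binom.pval is implemented internally.

```r
# Hypothetical helper, NOT the corpora implementation: central two-sided
# p-value of a binomial test, built from base R's pbinom().
central.pval <- function(k, n, p = 0.5) {
  lower <- pbinom(k, n, p)                          # P(X <= k)
  upper <- pbinom(k - 1, n, p, lower.tail = FALSE)  # P(X >= k)
  pmin(1, 2 * pmin(lower, upper))                   # double the smaller tail, cap at 1
}

pv <- central.pval(2, 10, p = 0.555)
ci <- binom.test(2, 10)$conf.int  # Clopper-Pearson interval (95% by default)

# Rejection at the 5% level should agree with p lying outside the interval
reject  <- pv < 0.05
outside <- 0.555 < ci[1] || 0.555 > ci[2]
print(c(p.value = pv, lower = ci[1], upper = ci[2]))
print(reject == outside)
```

Because pbinom and pmin are vectorised, the sketch also accepts vectors for k, n and p, in the same spirit as the arguments documented above.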
Value
The p-value of a binomial test applied to the given data (or a vector of p-values).
Author(s)
Stephanie Evert (https://purl.org/stephanie.evert)
References
Fay, Michael P. (2010). Two-sided exact tests and matching confidence intervals for discrete data. The R Journal, 2(1), 53-58.
See Also
Examples
# inconsistency between likelihood-based two-sided binomial test and confidence interval
binom.test(2, 10, p=0.555)
# central two-sided test as implemented by binom.pval is always consistent
binom.pval(2, 10, p=0.555)
prop.cint(2, 10, method="binomial")