chisq {corpora} | R Documentation |
Pearson's chi-squared statistic for frequency comparisons (corpora)
Description
This function computes Pearson's chi-squared statistic (often written
as X^2) for frequency comparison data, with or without Yates'
continuity correction. The implementation is based on the formula
given by Evert (2004, 82).
Usage
chisq(k1, n1, k2, n2, correct = TRUE, one.sided=FALSE)
Arguments
k1 |
frequency of a type in the first corpus (or an integer vector of type frequencies) |
n1 |
the sample size of the first corpus (or an integer vector specifying the sizes of different samples) |
k2 |
frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1) |
n2 |
the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1) |
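correct |
if TRUE, apply Yates' continuity correction (default) |
one.sided |
if TRUE, return the signed square root of X^2 as a statistic for a one-sided test (see Details; default is FALSE) |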
Details
The X^2 values returned by this function are identical to those
computed by chisq.test. Unlike the latter, chisq accepts vector
arguments so that a large number of frequency comparisons can be
carried out with a single function call.
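The vectorised computation can be sketched in base R as follows. This is an illustration only, not the source code of the corpora package: chisq_sketch is a hypothetical name, and the formula is the standard shortcut for a 2x2 contingency table, with Yates' correction capped at zero in the same way as chisq.test.

```r
# Hypothetical helper (not the corpora implementation): Pearson's X^2 for
# 2x2 frequency comparisons, vectorised over all four count arguments.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  # convert to double so that products of large counts cannot overflow
  k1 <- as.numeric(k1); n1 <- as.numeric(n1)
  k2 <- as.numeric(k2); n2 <- as.numeric(n2)
  N <- n1 + n2                      # combined sample size
  K <- k1 + k2                      # combined frequency of the type
  # |O11*O22 - O12*O21| for the table with columns (k1, n1-k1) and (k2, n2-k2)
  delta <- abs(k1 * n2 - k2 * n1)
  # Yates' continuity correction; pmax() keeps the corrected value at or above 0
  if (correct) delta <- pmax(0, delta - N / 2)
  N * delta^2 / (n1 * n2 * K * (N - K))
}

# two comparisons in a single call (the scalar sample sizes are recycled)
x2 <- chisq_sketch(c(99, 50), 1000, c(36, 60), 1000)

# reference value for the first comparison from chisq.test
ref <- chisq.test(matrix(c(99, 901, 36, 964), nrow = 2))$statistic
```

Here x2[1] should agree with ref up to floating-point error, while x2[2] comes from the same call at no extra cost.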
The one-sided test statistic (for one.sided=TRUE) is the signed
square root of X^2. It is positive for k_1/n_1 > k_2/n_2 and negative
for k_1/n_1 < k_2/n_2. Note that this statistic has a standard normal
distribution rather than a chi-squared distribution under the null
hypothesis of equal proportions.
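As a rough illustration of the signed square root, the following base R sketch derives it from the X^2 value reported by chisq.test. The variable names are assumptions made for this example, and the code does not call the corpora package itself.

```r
# example counts: 99 of 1000 tokens in sample 1, 36 of 1000 in sample 2
k1 <- 99; n1 <- 1000
k2 <- 36; n2 <- 1000

# two-sided X^2 (with Yates' correction) from the 2x2 contingency table
tab <- matrix(c(k1, n1 - k1, k2, n2 - k2), nrow = 2)
x2 <- unname(chisq.test(tab)$statistic)

# signed square root: positive if k1/n1 > k2/n2, negative if k1/n1 < k2/n2
z <- sign(k1 / n1 - k2 / n2) * sqrt(x2)

# one-sided p-value for the alternative k1/n1 > k2/n2, using the
# standard normal distribution of z under the null hypothesis
p_greater <- pnorm(z, lower.tail = FALSE)
```

Because z^2 equals X^2, the two-sided p-value 2 * pnorm(-abs(z)) coincides with the chi-squared p-value for df=1.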
Value
The chi-squared statistic X^2 corresponding to the specified
data (or a vector of X^2 values). This statistic has a
chi-squared distribution with df=1 under the null
hypothesis of equal proportions.
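Since X^2 has a chi-squared distribution with df=1 under the null hypothesis, a p-value or a critical value can be obtained with the base R functions pchisq and qchisq. The sketch below is one possible way to do this by hand and makes no claim about how chisq.pval (see See Also) is implemented.

```r
# X^2 for the example data, taken from chisq.test
tab <- matrix(c(99, 901, 36, 964), nrow = 2)
res <- chisq.test(tab)
x2 <- unname(res$statistic)

# upper-tail probability of the chi-squared distribution with df = 1;
# pchisq() is vectorised, so x2 may also be a vector of X^2 values
p <- pchisq(x2, df = 1, lower.tail = FALSE)

# critical value for a test at the 5% significance level
crit <- qchisq(0.95, df = 1)
significant <- x2 > crit
```

The value of p should match the p.value component returned by chisq.test for the same table.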
Author(s)
Stephanie Evert (https://purl.org/stephanie.evert)
References
Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Available from http://www.collocations.de/phd.html.
See Also
chisq.pval, chisq.test, cont.table
Examples
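# X-squared reported by chisq.test for the 2x2 contingency table
chisq.test(cont.table(99, 1000, 36, 1000))
# the same statistic, computed directly from the four counts
chisq(99, 1000, 36, 1000)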