ContingencyTests {coin} | R Documentation |
Tests of Independence in Two- or Three-Way Contingency Tables
Description
Testing the independence of two nominal or ordered factors.
Usage
## S3 method for class 'formula'
chisq_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
chisq_test(object, ...)
## S3 method for class 'IndependenceProblem'
chisq_test(object, ...)
## S3 method for class 'formula'
cmh_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
cmh_test(object, ...)
## S3 method for class 'IndependenceProblem'
cmh_test(object, ...)
## S3 method for class 'formula'
lbl_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
lbl_test(object, ...)
## S3 method for class 'IndependenceProblem'
lbl_test(object, ...)
Arguments
formula |
a formula of the form |
data |
an optional data frame containing the variables in the model formula. |
subset |
an optional vector specifying a subset of observations to be used. Defaults
to |
weights |
an optional formula of the form |
object |
an object inheriting from classes |
... |
further arguments to be passed to |
Details
chisq_test()
, cmh_test()
and lbl_test()
provide the
Pearson chi-squared test, the generalized Cochran-Mantel-Haenszel test and the
linear-by-linear association test. A general description of these methods is
given by Agresti (2002).
The null hypothesis of independence, or conditional independence given
block
, between y
and x
is tested.
If y
and/or x
are ordered factors, the default scores,
1:nlevels(y)
and 1:nlevels(x)
, respectively, can be altered
using the scores
argument (see independence_test()
); this
argument can also be used to coerce nominal factors to class "ordered"
.
(lbl_test()
coerces to class "ordered"
under any circumstances.)
If both y
and x
are ordered factors, a linear-by-linear
association test is computed and the direction of the alternative hypothesis
can be specified using the alternative
argument. For the Pearson
chi-squared test, this extension was given by Yates (1948) who also discussed
the situation when either the response or the covariate is an ordered factor;
see also Cochran (1954) and Armitage (1955) for the particular case when
y
is a binary factor and x
is ordered. The Mantel-Haenszel
statistic (Mantel and Haenszel, 1959) was similarly extended by Mantel (1963)
and Landis, Heyman and Koch (1978).
The conditional null distribution of the test statistic is used to obtain
p
-values and an asymptotic approximation of the exact distribution is
used by default (distribution = "asymptotic"
). Alternatively, the
distribution can be approximated via Monte Carlo resampling or computed
exactly for univariate two-sample problems by setting distribution
to
"approximate"
or "exact"
, respectively. See
asymptotic()
, approximate()
and
exact()
for details.
Value
An object inheriting from class "IndependenceTest"
.
Note
The exact versions of the Pearson chi-squared test and the generalized
Cochran-Mantel-Haenszel test do not necessarily result in the same
p
-value as Fisher's exact test (Davis, 1986).
References
Agresti, A. (2002). Categorical Data Analysis, Second Edition. Hoboken, New Jersey: John Wiley & Sons.
Armitage, P. (1955). Tests for linear trends in proportions and frequencies. Biometrics 11(3), 375–386. doi:10.2307/3001775
Cochran, W.G. (1954). Some methods for strengthening the common \chi^2
tests. Biometrics 10(4), 417–451. doi:10.2307/3001616
Davis, L. J. (1986). Exact tests for 2 \times 2
contingency
tables. The American Statistician 40(2), 139–141.
doi:10.1080/00031305.1986.10475377
Landis, J. R., Heyman, E. R. and Koch, G. G. (1978). Average partial association in three-way contingency tables: a review and discussion of alternative tests. International Statistical Review 46(3), 237–254. doi:10.2307/1402373
Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 22(4), 719–748. doi:10.1093/jnci/22.4.719
Mantel, N. (1963). Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association 58(303), 690–700. doi:10.1080/01621459.1963.10500879
Yates, F. (1948). The analysis of contingency tables with groupings based on quantitative characters. Biometrika 35(1/2), 176–181. doi:10.1093/biomet/35.1-2.176
Examples
## Example data
## Davis (1986, p. 140)
davis <- matrix(
c(3, 6,
2, 19),
nrow = 2, byrow = TRUE
)
davis <- as.table(davis)
## Asymptotic Pearson chi-squared test
chisq_test(davis)
chisq.test(davis, correct = FALSE) # same as above
## Approximative (Monte Carlo) Pearson chi-squared test
ct <- chisq_test(davis,
distribution = approximate(nresample = 10000))
pvalue(ct) # standard p-value
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
## Exact Pearson chi-squared test (Davis, 1986)
## Note: disagrees with Fisher's exact test
ct <- chisq_test(davis,
distribution = "exact")
pvalue(ct) # standard p-value
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
fisher.test(davis)
## Laryngeal cancer data
## Agresti (2002, p. 107, Tab. 3.13)
cancer <- matrix(
c(21, 2,
15, 3),
nrow = 2, byrow = TRUE,
dimnames = list(
"Treatment" = c("Surgery", "Radiation"),
"Cancer" = c("Controlled", "Not Controlled")
)
)
cancer <- as.table(cancer)
## Exact Pearson chi-squared test (Agresti, 2002, p. 108, Tab. 3.14)
## Note: agrees with Fishers's exact test
(ct <- chisq_test(cancer,
distribution = "exact"))
midpvalue(ct) # mid-p-value
pvalue_interval(ct) # p-value interval
size(ct, alpha = 0.05) # test size at alpha = 0.05 using the p-value
fisher.test(cancer)
## Homework conditions and teacher's rating
## Yates (1948, Tab. 1)
yates <- matrix(
c(141, 67, 114, 79, 39,
131, 66, 143, 72, 35,
36, 14, 38, 28, 16),
byrow = TRUE, ncol = 5,
dimnames = list(
"Rating" = c("A", "B", "C"),
"Condition" = c("A", "B", "C", "D", "E")
)
)
yates <- as.table(yates)
## Asymptotic Pearson chi-squared test (Yates, 1948, p. 176)
chisq_test(yates)
## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, pp. 180-181)
## Note: 'Rating' and 'Condition' as ordinal
(ct <- chisq_test(yates,
alternative = "less",
scores = list("Rating" = c(-1, 0, 1),
"Condition" = c(2, 1, 0, -1, -2))))
statistic(ct)^2 # chi^2 = 2.332
## Asymptotic Pearson-Yates chi-squared test (Yates, 1948, p. 181)
## Note: 'Rating' as ordinal
chisq_test(yates,
scores = list("Rating" = c(-1, 0, 1))) # Q = 3.825
## Change in clinical condition and degree of infiltration
## Cochran (1954, Tab. 6)
cochran <- matrix(
c(11, 7,
27, 15,
42, 16,
53, 13,
11, 1),
byrow = TRUE, ncol = 2,
dimnames = list(
"Change" = c("Marked", "Moderate", "Slight",
"Stationary", "Worse"),
"Infiltration" = c("0-7", "8-15")
)
)
cochran <- as.table(cochran)
## Asymptotic Pearson chi-squared test (Cochran, 1954, p. 435)
chisq_test(cochran) # X^2 = 6.88
## Asymptotic Cochran-Armitage test (Cochran, 1954, p. 436)
## Note: 'Change' as ordinal
(ct <- chisq_test(cochran,
scores = list("Change" = c(3, 2, 1, 0, -1))))
statistic(ct)^2 # X^2 = 6.66
## Change in size of ulcer crater for two treatment groups
## Armitage (1955, Tab. 2)
armitage <- matrix(
c( 6, 4, 10, 12,
11, 8, 8, 5),
byrow = TRUE, ncol = 4,
dimnames = list(
"Treatment" = c("A", "B"),
"Crater" = c("Larger", "< 2/3 healed",
">= 2/3 healed", "Healed")
)
)
armitage <- as.table(armitage)
## Approximative (Monte Carlo) Pearson chi-squared test (Armitage, 1955, p. 379)
chisq_test(armitage,
distribution = approximate(nresample = 10000)) # chi^2 = 5.91
## Approximative (Monte Carlo) Cochran-Armitage test (Armitage, 1955, p. 379)
(ct <- chisq_test(armitage,
distribution = approximate(nresample = 10000),
scores = list("Crater" = c(-1.5, -0.5, 0.5, 1.5))))
statistic(ct)^2 # chi_0^2 = 5.26
## Relationship between job satisfaction and income stratified by gender
## Agresti (2002, p. 288, Tab. 7.8)
## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
(ct <- cmh_test(jobsatisfaction)) # CMH = 10.2001
## The standardized linear statistic
statistic(ct, type = "standardized")
## The standardized linear statistic for each block
statistic(ct, type = "standardized", partial = TRUE)
## Asymptotic generalized Cochran-Mantel-Haenszel test (Agresti, p. 297)
## Note: 'Job.Satisfaction' as ordinal
cmh_test(jobsatisfaction,
scores = list("Job.Satisfaction" = c(1, 3, 4, 5))) # L^2 = 9.0342
## Asymptotic linear-by-linear association test (Agresti, p. 297)
## Note: 'Job.Satisfaction' and 'Income' as ordinal
(lt <- lbl_test(jobsatisfaction,
scores = list("Job.Satisfaction" = c(1, 3, 4, 5),
"Income" = c(3, 10, 20, 35))))
statistic(lt)^2 # M^2 = 6.1563
## The standardized linear statistic
statistic(lt, type = "standardized")
## The standardized linear statistic for each block
statistic(lt, type = "standardized", partial = TRUE)