Association measures {DescTools} | R Documentation |

Cramer's V, Pearson's Contingency Coefficient and Phi Coefficient, Yule's Q and Y, Tschuprow's T

Calculate Cramer's V, Pearson's contingency coefficient and phi, Yule's Q and Y and Tschuprow's T of `x`, if `x` is a table. If both `x` and `y` are given, the corresponding table is built first.

```
Phi(x, y = NULL, ...)
ContCoef(x, y = NULL, correct = FALSE, ...)
CramerV(x, y = NULL, conf.level = NA,
        method = c("ncchisq", "ncchisqadj", "fisher", "fisheradj"),
        correct = FALSE, ...)
YuleQ(x, y = NULL, ...)
YuleY(x, y = NULL, ...)
TschuprowT(x, y = NULL, correct = FALSE, ...)
```

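| Argument | Description |
|---|---|
| `x` | can be a numeric vector, a matrix or a table. |
| `y` | `NULL` (default) or a vector with compatible dimensions to `x`. If `y` is provided, `table(x, y, ...)` is calculated. |
| `conf.level` | confidence level of the interval. This is only implemented for Cramer's V. If set to `NA` (the default), no confidence interval is calculated. |
| `method` | string defining the method to calculate confidence intervals for Cramer's V. One out of `"ncchisq"`, `"ncchisqadj"`, `"fisher"` or `"fisheradj"`. |
| `correct` | logical. Applies to `ContCoef`, `CramerV` and `TschuprowT`: whether the corrected version of the coefficient should be returned. Default is `FALSE`. |
| `...` | further arguments are passed to the function `table`, e.g. `useNA`. |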

Either a matrix (or table) `x` or two vectors `x` and `y` are expected. In the latter case `table(x, y, ...)` is calculated. The function handles `NA`s the same way the `table` function does, so tables are by default calculated with `NA`s omitted.
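The `NA` behaviour inherited from `table` can be sketched in base R alone. The vectors below are made-up illustration data, not taken from the package; the point is only that pairs with a missing value are dropped by default and kept as their own category with `useNA = "ifany"`.

```r
# Made-up example vectors with one missing value each
x <- c("a", "b", NA,  "a", "b", "a")
y <- c("u", "v", "u", NA,  "v", "u")

# Default: pairs containing an NA are dropped from the cross-tabulation
tab_default <- table(x, y)

# useNA = "ifany": NA becomes an extra row / column category
tab_na <- table(x, y, useNA = "ifany")

print(tab_default)
print(tab_na)
print(c(default = sum(tab_default), with_na = sum(tab_na)))
```

Since further arguments are forwarded to `table`, the same `useNA` setting can be supplied when two vectors are passed to one of the association functions.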

A provided matrix is interpreted as a contingency table, which seems the natural interpretation in the case of frequency data (this is, e.g., also what `chisq.test` expects).

Use the function `PairApply` (pairwise apply) if the measure should be calculated pairwise for all columns. This allows matrices of association measures to be calculated the same way `cor` does. `NA`s are by default omitted pairwise, which corresponds to the `pairwise.complete.obs` option of `cor`. Use `complete.cases` if only the complete cases of a `data.frame` are to be used (see examples).
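The difference between pairwise omission and complete cases can be illustrated without the package. The base R sketch below uses a made-up data frame (the column names `v1`, `v2`, `v3` are hypothetical) and only counts how many rows would enter each pairwise table under the two strategies.

```r
# Made-up data frame with scattered missing values
d <- data.frame(
  v1 = c("x", "y", NA,  "x", "y", "x"),
  v2 = c("p", "q", "p", NA,  "q", "p"),
  v3 = c("m", NA,  "n", "m", "n", "m")
)

# Pairwise omission: each pair of columns uses every row in which
# both of those two columns are observed
n_pairwise <- sapply(d, function(u) {
  sapply(d, function(v) sum(!is.na(u) & !is.na(v)))
})

# Complete cases: only rows without any missing value are used for all pairs
n_complete <- sum(complete.cases(d))

print(n_pairwise)
print(n_complete)
```

Pairwise omission keeps more observations per pair, at the price that different cells of the resulting matrix may be based on different subsets of rows.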

The maximum value for Phi is $\sqrt{\min(r, c) - 1}$, where $r$ and $c$ are the numbers of rows and columns of the table. The contingency coefficient goes from 0 to $\sqrt{\frac{\min(r, c) - 1}{\min(r, c)}}$. For the corrected contingency coefficient and for Cramer's V the range is 0 to 1.
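These bounds follow from the usual chi-square-based definitions, summarised below. This is a textbook sketch rather than a transcription of the package source. The symbols are introduced here for illustration: $\chi^2$ is Pearson's chi-square statistic of the table, $n$ the total count and $k = \min(r, c)$. The key fact is that $\chi^2$ can be at most $n\,(k - 1)$.

```latex
\begin{aligned}
\phi &= \sqrt{\frac{\chi^2}{n}},            & 0 \le \phi &\le \sqrt{k - 1}, \\
C    &= \sqrt{\frac{\chi^2}{\chi^2 + n}},   & 0 \le C    &\le \sqrt{\frac{k - 1}{k}}, \\
V    &= \sqrt{\frac{\chi^2}{n\,(k - 1)}},   & 0 \le V    &\le 1 .
\end{aligned}
```

Substituting $\chi^2 = n\,(k - 1)$ gives each upper bound. Dividing $C$ by its upper bound rescales it to $[0, 1]$, which is the idea behind Sakoda's adjusted contingency coefficient.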

As a rule of thumb, a Cramer's V in [0, 0.3] is considered weak, in (0.3, 0.7] medium and above 0.7 strong.
For Phi, the contingency coefficient, Cramer's V and Tschuprow's T the minimum value is 0, attained under statistical independence.
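As a rough numerical check of these ranges, the following base R sketch computes the measures by hand for an exactly independent table and for a perfectly associated one. The helper `assoc_sketch` is hypothetical and uses textbook formulas without any correction; it is not the DescTools implementation.

```r
# Hypothetical helper (not part of DescTools): chi-square-based association
# measures from a contingency table, textbook formulas, no corrections.
assoc_sketch <- function(tab) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  expected <- outer(rowSums(tab), colSums(tab)) / n  # counts under independence
  chi2 <- sum((tab - expected)^2 / expected)         # Pearson's chi-square
  k <- min(dim(tab))
  c(phi = sqrt(chi2 / n),                            # Phi
    cc  = sqrt(chi2 / (chi2 + n)),                   # contingency coefficient
    v   = sqrt(chi2 / (n * (k - 1))),                # Cramer's V
    t   = sqrt(chi2 / (n * sqrt((nrow(tab) - 1) * (ncol(tab) - 1)))))  # Tschuprow's T
}

# Independent table: every cell equals (row total * column total) / n
indep <- outer(c(1, 2, 3), c(2, 3, 5))
res_indep <- assoc_sketch(indep)

# Perfect association: all counts on the diagonal of a 3 x 3 table
perfect <- diag(c(20, 20, 20))
res_perfect <- assoc_sketch(perfect)

print(round(res_indep, 4))
print(round(res_perfect, 4))
```

For the independent table all four measures are 0. For the diagonal table Cramer's V and Tschuprow's T reach 1, while Phi and the contingency coefficient reach their table-size-dependent maxima.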

Each function returns a single numeric value if no confidence intervals are requested, and otherwise a numeric vector with 3 elements: the estimate and the lower and upper confidence limits.

Andri Signorell <andri@signorell.net>,

Michael Smithson <michael.smithson@anu.edu.au> (confidence intervals for Cramer V)

Yule, G. Udny (1912) On the methods of measuring association between two attributes. *Journal of the Royal Statistical Society, LXXV*, 579-652

Tschuprow, A. A. (1939) *Principles of the Mathematical Theory of Correlation*, translated by M. Kantorowitsch. W. Hodge & Co.

Cramer, H. (1946) *Mathematical Methods of Statistics*. Princeton University Press

Agresti, Alan (1996) *Introduction to categorical data analysis*. NY: John Wiley and Sons

Sakoda, J.M. (1977) Measures of Association for Multivariate Contingency Tables,
*Proceedings of the Social Statistics Section of the American Statistical Association* (Part III), 777-780.

Smithson, M.J. (2003) *Confidence Intervals, Quantitative Applications in the Social Sciences Series*, No. 140. Thousand Oaks, CA: Sage. pp. 39-41

Bergsma, W. (2013) A bias-correction for Cramer's V and Tschuprow's T. *Journal of the Korean Statistical Society* 42(3). DOI: 10.1016/j.jkss.2012.10.002

`table`, `PlotCorr`, `PairApply`, `Assocs`

```
library(DescTools)   # provides d.pizza, the association measures, PairApply and Untable

# association measures from a contingency table
tab <- table(d.pizza$driver, d.pizza$wine_delivered)
Phi(tab)
ContCoef(tab)
CramerV(tab)
TschuprowT(tab)

# just x and y
CramerV(d.pizza$driver, d.pizza$wine_delivered)

# data.frame: pairwise Cramer's V for all columns
PairApply(d.pizza[, c("driver", "operator", "area")], CramerV, symmetric = TRUE)

# useNA is passed to table
PairApply(d.pizza[, c("driver", "operator", "area")], CramerV,
          useNA = "ifany", symmetric = TRUE)

# use the complete cases only
d.frm <- d.pizza[, c("driver", "operator", "area")]
PairApply(d.frm[complete.cases(d.frm), ], CramerV, symmetric = TRUE)

# Yule's Q and Y for a 2 x 2 table
m <- as.table(matrix(c(2, 4, 1, 7), nrow = 2))
YuleQ(m)
YuleY(m)

# Bootstrap confidence intervals for Cramer's V
# http://support.sas.com/documentation/cdl/en/statugfreq/63124/PDF/default/statugfreq.pdf, p. 1821
tab <- as.table(rbind(
  c(26, 26, 23, 18,  9),
  c( 6,  7,  9, 14, 23)))
d.frm <- Untable(tab)   # expand the table to one row per observation

n <- 1000               # number of bootstrap replicates
idx <- matrix(sample(nrow(d.frm), size = nrow(d.frm) * n, replace = TRUE),
              ncol = n, byrow = FALSE)
v <- apply(idx, 2, function(x) CramerV(d.frm[x, 1], d.frm[x, 2]))
quantile(v, probs = c(0.025, 0.975))

# compare this to the analytical ones
CramerV(tab, conf.level = 0.95)
```
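For orientation, Yule's Q and Y for the 2 x 2 table used above can also be computed by hand. The sketch below uses the standard textbook definitions based on the cross products of the table. It is assumed, not verified here, that `YuleQ` and `YuleY` follow the same definitions. The names `n11`, `n12`, `n21` and `n22` are introduced for illustration only.

```r
# Same 2 x 2 table as in the example above (filled column by column)
m <- matrix(c(2, 4, 1, 7), nrow = 2)
n11 <- m[1, 1]; n12 <- m[1, 2]
n21 <- m[2, 1]; n22 <- m[2, 2]

# Cross products of the main diagonal and the off diagonal
main <- n11 * n22   # 2 * 7 = 14
off  <- n12 * n21   # 1 * 4 = 4

# Yule's Q: normalised difference of the cross products
q <- (main - off) / (main + off)

# Yule's Y: the same with square roots of the cross products
y <- (sqrt(main) - sqrt(off)) / (sqrt(main) + sqrt(off))

print(c(Q = q, Y = y))
```

Both measures are 0 when the cross products are equal, that is, under independence, and they range from -1 to 1.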

[Package *DescTools* version 0.99.51 Index]