USP.test {USP} | R Documentation |
Independence test for discrete data
Description
Carry out a permutation independence test on a two-way contingency table.
The test statistic is T_n, as described in Sections 3.1 and 7.1 of (Berrett et al. 2021).
This also appears as U_n in (Berrett and Samworth 2021).
The critical value is found by sampling null contingency tables,
with the same row and column totals as the input, via Patefield's algorithm, and recomputing
the test statistic.
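The calibration step can be sketched with base R's r2dtable, which samples tables with fixed margins using Patefield's algorithm. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names toy_stat and perm_calibrate are hypothetical, and toy_stat (the sum of squared deviations from the expected counts) is only a stand-in for T_n, whose exact form is given in the cited papers.

```r
# Stand-in statistic (NOT T_n): squared deviations from the counts expected
# under independence, i.e. row total * column total / n.
toy_stat <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2)
}

# Hypothetical helper: recompute the statistic on B null tables that share
# the row and column totals of the input table.
perm_calibrate <- function(freq, B = 999, stat = toy_stat) {
  null_tables <- r2dtable(B, rowSums(freq), colSums(freq))
  null_stats <- vapply(null_tables, stat, numeric(1))
  list(observed = stat(freq), null_stats = null_stats)
}

set.seed(1)
freq <- diag(1:5)
res <- perm_calibrate(freq, B = 199)

# Share of null statistics at least as large as the observed one
mean(res$null_stats >= res$observed)
```

Because every resampled table has the same margins as the input, the null statistics are directly comparable with the observed one.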
Usage
USP.test(freq, B = 999, ties.method = "standard", nullstats = FALSE)
Arguments
freq |
Two-way contingency table whose independence is to be tested. |
B |
The number of resampled null tables to be used to calibrate the test. |
ties.method |
If "standard", the p-value is calculated as in (5) of (Berrett et al. 2021), which is slightly conservative. If "random", ties are broken randomly, which preserves Type I error control. |
nullstats |
If TRUE, returns a vector of the null statistic values. |
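The two ties.method options can be illustrated as follows. This sketch assumes the usual permutation p-value convention, in which the observed statistic is counted among the B + 1 values; the exact expression used by the package is (5) in (Berrett et al. 2021). The helper perm_pvalue and its arguments are hypothetical and are not part of the USP package.

```r
# Hypothetical helper: obs is the observed statistic, null_stats the
# statistics recomputed on the resampled null tables.
perm_pvalue <- function(obs, null_stats, ties.method = "standard") {
  B <- length(null_stats)
  greater <- sum(null_stats > obs)
  ties <- sum(null_stats == obs)
  if (ties.method == "standard") {
    # Every tie counts against the observed statistic (conservative).
    return((1 + greater + ties) / (B + 1))
  }
  # "random": place the observed statistic uniformly at random among the
  # ties + 1 equal values.
  (greater + sample.int(ties + 1L, 1L)) / (B + 1)
}

null_stats <- c(1, 2, 3, 3, 5)
p_standard <- perm_pvalue(3, null_stats)
set.seed(1)
p_random <- perm_pvalue(3, null_stats, ties.method = "random")
```

With no ties the two options coincide; with ties, the randomised p-value is never larger than the standard one.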
Value
Returns the p-value for this independence test and the value of the test statistic, T_n,
as defined in (Berrett et al. 2021). The third element of the list is the table of expected counts,
and the final element is the table of contributions to T_n. If nullstats=TRUE is used, then the function also
returns a vector of the null statistics.
References
Berrett TB, Kontoyiannis I, Samworth RJ (2021). “Optimal rates for independence testing via U-statistic permutation tests.” Annals of Statistics, to appear.
Berrett TB, Samworth RJ (2021). “USP: an independence test that improves on Pearson’s chi-squared and the G-test.” Submitted, available at arXiv:2101.10880.
Examples
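library(USP)

# Table with extra mass on the diagonal, so rows and columns are dependent
freq <- r2dtable(1, rep(10, 5), rep(10, 5))[[1]] + 4 * diag(rep(1, 5))
USP.test(freq, 999)

# Diagonal table
freq <- diag(1:5)
USP.test(freq, 999)

# Table sampled with fixed margins; keep the null statistics for plotting
freq <- r2dtable(1, rep(10, 5), rep(10, 5))[[1]]
test <- USP.test(freq, 999, nullstats = TRUE)
upper <- max(test$NullStats, test$TestStat)
plot(density(test$NullStats, from = 0, to = upper),
     xlim = c(min(test$NullStats), upper),
     main = "Test Statistics")
abline(v = test$TestStat, col = 2)  # observed test statistic
TestStats <- c(test$TestStat, test$NullStats)
abline(v = quantile(TestStats, probs = 0.95), lty = 2)  # 95% quantile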