simultest.fisher {PEtests}    R Documentation

Two-sample simultaneous test using Fisher's combination

Description

This function implements the two-sample simultaneous test on high-dimensional mean vectors and covariance matrices using Fisher's combination. Suppose \{\mathbf{X}_1, \ldots, \mathbf{X}_{n_1}\} are i.i.d. copies of \mathbf{X}, and \{\mathbf{Y}_1, \ldots, \mathbf{Y}_{n_2}\} are i.i.d. copies of \mathbf{Y}. Let p_{CQ} and p_{LC} denote the p-values associated with the l_2-norm-based mean test proposed in Chen and Qin (2010) (see meantest.cq for details) and the l_2-norm-based covariance test proposed in Li and Chen (2012) (see covtest.lc for details), respectively. The simultaneous test statistic via Fisher's combination is defined as

J_{n_1, n_2} = -2\log(p_{CQ}) -2\log(p_{LC}).

It has been proved that, under some regularity conditions and under the null hypothesis H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 \ \text{ and } \ \mathbf{\Sigma}_1 = \mathbf{\Sigma}_2, the two tests are asymptotically independent as n_1, n_2, p\rightarrow \infty. Under H_0 each p-value is asymptotically uniform on (0, 1), so each term -2\log(p) is asymptotically \chi_2^2, and the sum of the two asymptotically independent terms, J_{n_1,n_2}, converges in distribution to a \chi_4^2 distribution. The asymptotic p-value is obtained by

p\text{-value} = 1-F_{\chi_4^2}(J_{n_1,n_2}),

where F_{\chi_4^2}(\cdot) is the cdf of the \chi_4^2 distribution.
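
The combination step can be reproduced in base R once the two p-values are available. The sketch below is illustrative only: fisher_combine and fisher_closed_form are hypothetical helper names that are not part of PEtests, and the input p-values 0.03 and 0.20 are made up. It also checks the result against the closed form of the \chi_4^2 upper tail, 1-F_{\chi_4^2}(x) = e^{-x/2}(1 + x/2), which for x = J_{n_1,n_2} gives p_{CQ} p_{LC} (1 - \log(p_{CQ} p_{LC})).

```r
# Hypothetical helper (not part of PEtests): Fisher's combination of two p-values.
fisher_combine <- function(p_cq, p_lc) {
  stopifnot(p_cq > 0, p_cq <= 1, p_lc > 0, p_lc <= 1)
  # J = -2 log(p_CQ) - 2 log(p_LC)
  stat <- -2 * log(p_cq) - 2 * log(p_lc)
  # Asymptotic p-value: upper tail of the chi-square distribution with 4 df
  pval <- pchisq(stat, df = 4, lower.tail = FALSE)
  list(stat = stat, pval = pval)
}

# Hypothetical helper: closed form of the same upper-tail probability,
# exp(-x/2) * (1 + x/2) evaluated at x = -2 * log(p_cq * p_lc).
fisher_closed_form <- function(p_cq, p_lc) {
  q <- p_cq * p_lc
  q * (1 - log(q))
}

# Made-up p-values, used only to illustrate the arithmetic
res <- fisher_combine(0.03, 0.20)
print(res)
# The two routes to the p-value should agree up to numerical tolerance
print(all.equal(res$pval, fisher_closed_form(0.03, 0.20)))
```

The closed form makes the behaviour of the combination easy to see: the combined p-value depends on the two inputs only through their product.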

Usage

simultest.fisher(dataX, dataY)

Arguments

dataX

an n_1 by p data matrix

dataY

an n_2 by p data matrix

Value

stat

the value of the test statistic

pval

the p-value for the test

References

Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. The Annals of Statistics, 38(2):808–835.

Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.

Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.

Examples

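library(PEtests)
# Simulate two independent samples of dimension pp = 500 from the same
# standard normal distribution, so the null hypothesis holds
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
# Test H_0: equal mean vectors and equal covariance matrices
simultest.fisher(X, Y)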
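
For contrast, a minimal sketch of the same call on data generated under an alternative, where both the mean vector and the covariance matrix differ between the samples. The mean shift of 0.2 and the standard deviation of 1.2 are arbitrary choices made for illustration, and the names X_alt, Y_alt and res_alt are not from the package. The call is guarded so that the snippet also runs when PEtests is not installed, and str() is used so that nothing is assumed about the class of the returned object.

```r
set.seed(2)
n1 <- 100; n2 <- 100; pp <- 500

# First sample: standard normal entries
X_alt <- matrix(rnorm(n1 * pp), nrow = n1, ncol = pp)
# Second sample: every coordinate has mean 0.2 and standard deviation 1.2
# (arbitrary illustrative values), so both parts of H_0 are violated
Y_alt <- matrix(rnorm(n2 * pp, mean = 0.2, sd = 1.2), nrow = n2, ncol = pp)

# Run the test only if PEtests is installed
if (requireNamespace("PEtests", quietly = TRUE)) {
  res_alt <- PEtests::simultest.fisher(X_alt, Y_alt)
  # Inspect the returned object without assuming its structure
  str(res_alt)
}
```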

[Package PEtests version 0.1.0 Index]