Two-sample high-dimensional mean test (Chen and Qin, 2010)
Description
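This function implements the two-sample $\ell_2$-norm-based high-dimensional
mean test proposed by Chen and Qin (2010).
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d.
copies of a $p$-dimensional random vector $X$ with mean $\mu_1$ and covariance matrix $\Sigma_1$,
and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$ with mean $\mu_2$ and covariance matrix $\Sigma_2$.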
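The test statistic $M_{CQ}$ is defined as
$$M_{CQ} = \frac{1}{n_1(n_1-1)}\sum_{i \neq j}^{n_1} X_i'X_j + \frac{1}{n_2(n_2-1)}\sum_{i \neq j}^{n_2} Y_i'Y_j - \frac{2}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} X_i'Y_j.$$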
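The Chen and Qin (2010) statistic averages the cross-products $X_i'X_j$ and $Y_i'Y_j$ over pairs $i \neq j$ and subtracts twice the average of $X_i'Y_j$, which gives an unbiased estimate of $\|\mu_1 - \mu_2\|^2$. A minimal sketch of that computation follows. The helper name `cq_statistic` is hypothetical and is not claimed to be how `meantest.cq` is implemented internally.

```r
# Hypothetical helper (not part of the documented package).
# X: n1 x p matrix, Y: n2 x p matrix; rows are observations.
cq_statistic <- function(X, Y) {
  n1 <- nrow(X)
  n2 <- nrow(Y)
  stopifnot(ncol(X) == ncol(Y), n1 >= 2, n2 >= 2)

  Gx  <- tcrossprod(X)      # n1 x n1 matrix with entries X_i' X_j
  Gy  <- tcrossprod(Y)      # n2 x n2 matrix with entries Y_i' Y_j
  Gxy <- tcrossprod(X, Y)   # n1 x n2 matrix with entries X_i' Y_j

  # Sums over i != j: total sum minus the diagonal
  sx <- (sum(Gx) - sum(diag(Gx))) / (n1 * (n1 - 1))
  sy <- (sum(Gy) - sum(diag(Gy))) / (n2 * (n2 - 1))

  sx + sy - 2 * sum(Gxy) / (n1 * n2)
}
```

Working with Gram matrices costs memory of order $n_1^2 + n_2^2 + n_1 n_2$ but avoids explicit loops over pairs, which is usually acceptable when the sample sizes are much smaller than $p$.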
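Under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the leading variance of $M_{CQ}$ is
$$\sigma^2_{M_{CQ}} = \frac{2}{n_1(n_1-1)}\mathrm{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2-1)}\mathrm{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2}\mathrm{tr}(\Sigma_1\Sigma_2),$$
which can be consistently estimated by
$$\hat\sigma^2_{M_{CQ}} = \frac{2}{n_1(n_1-1)}\widehat{\mathrm{tr}(\Sigma_1^2)} + \frac{2}{n_2(n_2-1)}\widehat{\mathrm{tr}(\Sigma_2^2)} + \frac{4}{n_1 n_2}\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)}.$$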
The explicit formulas of $\widehat{\mathrm{tr}(\Sigma_1^2)}$,
$\widehat{\mathrm{tr}(\Sigma_2^2)}$, and
$\widehat{\mathrm{tr}(\Sigma_1\Sigma_2)}$
can be found in Section 3 of Chen and Qin (2010).
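As a rough illustration, the sketch below transcribes the leave-out trace estimators as they are commonly stated from Section 3 of Chen and Qin (2010): each observation is centered by the sample mean computed without the observations involved in the product. The function names are hypothetical, the loops are written for readability rather than speed, and the formulas should be checked against the paper before being relied on.

```r
# Hypothetical helpers (not part of the documented package).

# Estimate tr(Sigma^2) from one sample; needs at least 3 rows.
tr_sigma_sq_hat <- function(X) {
  n <- nrow(X)
  stopifnot(n >= 3)
  cs <- colSums(X)
  total <- 0
  for (j in seq_len(n)) {
    for (k in seq_len(n)) {
      if (j == k) next
      m <- (cs - X[j, ] - X[k, ]) / (n - 2)  # mean without X_j and X_k
      total <- total +
        sum(X[j, ] * (X[k, ] - m)) * sum(X[k, ] * (X[j, ] - m))
    }
  }
  total / (n * (n - 1))
}

# Subtract from each row the mean of all the other rows.
center_leave_one_out <- function(X) {
  n <- nrow(X)
  all_sums <- matrix(colSums(X), nrow = n, ncol = ncol(X), byrow = TRUE)
  X - (all_sums - X) / (n - 1)
}

# Estimate tr(Sigma1 %*% Sigma2) from two samples.
tr_sigma12_hat <- function(X, Y) {
  Xc <- center_leave_one_out(X)
  Yc <- center_leave_one_out(Y)
  # Entry (l, k): (X_l' Yc_k) * (Xc_l' Y_k), summed over all l and k
  sum(tcrossprod(X, Yc) * tcrossprod(Xc, Y)) / (nrow(X) * nrow(Y))
}

# Plug the three trace estimates into the variance formula.
cq_sigma2_hat <- function(X, Y) {
  n1 <- nrow(X)
  n2 <- nrow(Y)
  2 / (n1 * (n1 - 1)) * tr_sigma_sq_hat(X) +
    2 / (n2 * (n2 - 1)) * tr_sigma_sq_hat(Y) +
    4 / (n1 * n2) * tr_sigma12_hat(X, Y)
}
```

The double loop in `tr_sigma_sq_hat` costs on the order of $n^2 p$ operations, which is fine for a few hundred observations but would need vectorizing for larger samples.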
Under some regularity conditions and the null hypothesis
$H_{0m}: \mu_1 = \mu_2$,
the standardized test statistic $M_{CQ}/\hat\sigma_{M_{CQ}}$ converges in distribution to a standard normal distribution
as $n_1, n_2, p \to \infty$.
The asymptotic p-value is obtained by
$$p_{CQ} = 1 - \Phi(M_{CQ}/\hat\sigma_{M_{CQ}}),$$
where $\Phi(\cdot)$ is the cdf of the standard normal distribution.
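The p-value is a one-sided upper-tail normal probability, so large positive values of the standardized statistic lead to rejection. A minimal sketch, assuming the unstandardized statistic and its estimated standard deviation are already available (the function and argument names are hypothetical):

```r
# Hypothetical helper (not part of the documented package).
# m_cq:      value of the unstandardized statistic M_CQ
# sigma_hat: estimated standard deviation of M_CQ (square root of the variance estimate)
cq_pvalue <- function(m_cq, sigma_hat) {
  stopifnot(sigma_hat > 0)
  z <- m_cq / sigma_hat
  # Same quantity as 1 - pnorm(z), but more accurate far in the upper tail
  pnorm(z, lower.tail = FALSE)
}
```

For example, `cq_pvalue(0, 1)` is 0.5, since a standardized statistic of zero sits at the center of the null distribution.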
Usage
meantest.cq(dataX,dataY)
Arguments
dataX
an n1 by p data matrix
dataY
an n2 by p data matrix
Value
stat the value of the test statistic
pval the p-value of the test
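Examples
A short usage sketch on simulated data where both samples share the same mean, so the null hypothesis holds. It assumes that the package providing `meantest.cq` is already attached and that the result is a list whose components are accessed as `$stat` and `$pval`. The call is guarded so the snippet still runs when the function is not available.

```r
set.seed(2010)
n1 <- 50
n2 <- 60
p  <- 200

# Rows are observations, columns are variables
dataX <- matrix(rnorm(n1 * p), nrow = n1, ncol = p)
dataY <- matrix(rnorm(n2 * p), nrow = n2, ncol = p)

# Assumption: meantest.cq is available in the current session
if (exists("meantest.cq")) {
  res <- meantest.cq(dataX, dataY)
  print(res$stat)   # value of the test statistic
  print(res$pval)   # asymptotic p-value
}
```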
References
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data
with applications to gene-set testing.
Annals of Statistics, 38(2):808–835.