apval_Cai2014 {highmean}    R Documentation

Asymptotics-Based p-value of the Test Proposed by Cai et al. (2014)

Description

Calculates the p-value of the test proposed by Cai et al. (2014) for the equality of two high-dimensional mean vectors, based on the asymptotic distribution of the test statistic.

Usage

apval_Cai2014(sam1, sam2, eq.cov = TRUE)

Arguments

sam1

an n1 by p matrix from sample population 1. Each row represents a p-dimensional sample.

sam2

an n2 by p matrix from sample population 2. Each row represents a p-dimensional sample.

eq.cov

a logical value. The default is TRUE, indicating that the two sample populations have the same covariance matrix; otherwise, the covariance matrices are assumed to be different.

Details

Suppose that two groups of p-dimensional independent and identically distributed samples \{X_{1i}\}_{i=1}^{n_1} and \{X_{2j}\}_{j=1}^{n_2} are observed; we consider high-dimensional data with p \gg n := n_1 + n_2 - 2. Assume that the covariances of the two sample populations are \Sigma_1 = (\sigma_{1, ij}) and \Sigma_2 = (\sigma_{2, ij}). The primary objective is to test H_{0}: \mu_1 = \mu_2 versus H_{A}: \mu_1 \neq \mu_2. Let \bar{X}_{k} be the sample mean for group k = 1, 2. For a vector v, we denote its ith element by v^{(i)}.

Cai et al. (2014) proposed the following test statistic:

T_{CLX} = \max_{i = 1, \ldots, p} (\bar{X}_1^{(i)} - \bar{X}_2^{(i)})^2/(\sigma_{1,ii}/n_1 + \sigma_{2, ii}/n_2).

Under the null hypothesis, this test statistic, after centering by 2 \log p - \log \log p, asymptotically follows an extreme value distribution, from which the p-value is computed.
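The computation described above can be sketched in base R. This is an illustration under stated assumptions, not necessarily how apval_Cai2014 is implemented internally: the helper name clx_pvalue_sketch is hypothetical, the unknown variances \sigma_{k,ii} are replaced by sample variances (pooled when eq.cov = TRUE), and the limiting distribution function is taken to be \exp\{-\pi^{-1/2} \exp(-x/2)\} for T_{CLX} - 2 \log p + \log \log p, as in Cai et al. (2014).

```r
# Hypothetical helper (not part of highmean): hand-rolled statistic and
# asymptotic p-value, using sample variances as plug-in estimates.
clx_pvalue_sketch <- function(sam1, sam2, eq.cov = TRUE) {
  n1 <- nrow(sam1)
  n2 <- nrow(sam2)
  p <- ncol(sam1)
  diff <- colMeans(sam1) - colMeans(sam2)  # coordinate-wise mean differences
  v1 <- apply(sam1, 2, var)                # sample variances, group 1
  v2 <- apply(sam2, 2, var)                # sample variances, group 2
  if (eq.cov) {
    # assumption: pooled variance estimate under a common covariance
    pooled <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    denom <- pooled * (1 / n1 + 1 / n2)
  } else {
    denom <- v1 / n1 + v2 / n2
  }
  stat <- max(diff^2 / denom)              # max-type statistic
  centered <- stat - 2 * log(p) + log(log(p))
  # upper tail of the limiting extreme value distribution
  pval <- 1 - exp(-exp(-centered / 2) / sqrt(pi))
  list(stat = stat, pval = pval)
}

# Toy data with independent coordinates, generated under the null
set.seed(1)
n1 <- 40
n2 <- 60
p <- 100
sam1 <- matrix(rnorm(n1 * p), n1, p)
sam2 <- matrix(rnorm(n2 * p), n2, p)
res <- clx_pvalue_sketch(sam1, sam2)
res_uneq <- clx_pvalue_sketch(sam1, sam2, eq.cov = FALSE)
```

Because only the diagonal of each covariance matrix enters the statistic, the sketch never forms a full p by p matrix.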

Value

A list including the following elements:

sam.info

the basic information about the two groups of samples, including the sample sizes and dimension.

cov.assumption

the equality assumption on the covariances of the two sample populations; this was specified by the argument eq.cov.

method

a reminder that the p-value is obtained from the asymptotic distribution of the test statistic.

pval

the p-value of the test proposed by Cai et al. (2014).

Note

This function does not transform the data with their precision matrix (see Cai et al., 2014). To calculate the p-value of the test statistic with the transformation, users can supply transformed samples as sam1 and sam2.
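As a rough illustration of supplying transformed samples, the sketch below multiplies each observation by a precision matrix before the test would be applied. It assumes an oracle setting in which the true covariance (here an AR(1) structure chosen for the example) is known and invertible; in practice the precision matrix would have to be estimated, which is not shown. The names Sigma, Omega, sam1_t and sam2_t are illustrative only.

```r
set.seed(42)
n1 <- 30
n2 <- 30
p <- 50

# Assumption: the common covariance is known (AR(1) with correlation 0.4)
Sigma <- 0.4^abs(outer(1:p, 1:p, "-"))
Omega <- solve(Sigma)            # oracle precision matrix

# Draw Gaussian samples with covariance Sigma via the Cholesky factor
R <- chol(Sigma)                 # Sigma = t(R) %*% R
sam1 <- matrix(rnorm(n1 * p), n1, p) %*% R
sam2 <- matrix(rnorm(n2 * p), n2, p) %*% R

# Rows are observations, so mapping each X_i to Omega X_i is a
# right-multiplication by the (symmetric) precision matrix
sam1_t <- sam1 %*% Omega
sam2_t <- sam2 %*% Omega

# The transformed samples would then be passed to the test, e.g.
# apval_Cai2014(sam1_t, sam2_t)   # requires the highmean package
```

The call to apval_Cai2014 is left commented out so that the sketch runs without the package installed.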

References

Cai TT, Liu W, and Xia Y (2014). "Two-sample test of high dimensional means under dependence." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(2), 349–372.

See Also

epval_Cai2014

Examples

library(MASS)
set.seed(1234)
n1 <- n2 <- 50
p <- 200
mu1 <- rep(0, p)
mu2 <- mu1
mu2[1:10] <- 0.2
true.cov <- 0.4^(abs(outer(1:p, 1:p, "-"))) # AR1 covariance
sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov)
sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov)
apval_Cai2014(sam1, sam2)

# the two sample populations have different covariances
true.cov1 <- 0.2^(abs(outer(1:p, 1:p, "-")))
true.cov2 <- 0.6^(abs(outer(1:p, 1:p, "-")))
sam1 <- mvrnorm(n = n1, mu = mu1, Sigma = true.cov1)
sam2 <- mvrnorm(n = n2, mu = mu2, Sigma = true.cov2)
apval_Cai2014(sam1, sam2, eq.cov = FALSE)

[Package highmean version 3.0]