R: Samples from the non-null distribution of the Hotelling-T^2...

Sim_HotellingT_unequal_var {fPASS}

R Documentation

Samples from the non-null distribution of the Hotelling-`T^2` statistic under unequal covariance.

Description

The function Sim_HotellingT_unequal_var() generates samples from the (non-null) distribution of the two-sample Hotelling-T^2 statistic under the assumption that the covariance of the multivariate response differs between the two groups. This function is used to compute the power function of the Two-Sample (TS) Projection-based test (Wang 2021, EJS) for sparsely observed univariate functional data.

Usage

Sim_HotellingT_unequal_var(
  total_sample_size,
  mean_diff,
  sig1,
  sig2,
  alloc.ratio = c(1, 1),
  nsim = 10000
)

Arguments

`total_sample_size`	Target sample size, must be a positive integer.
`mean_diff`	The difference in the mean vector between the two groups, must be a vector.
`sig1`	The true (or estimated) covariance matrix for the first group. Must be symmetric (`is.symmetric(sig1) == TRUE`) and positive definite (`chol(sig1)` runs without an error).
`sig2`	The true (or estimated) covariance matrix for the second group. Must be symmetric (`is.symmetric(sig2) == TRUE`) and positive definite (`chol(sig2)` runs without an error).
`alloc.ratio`	Allocation of the total sample size between the two groups. Must be a vector of two positive numbers. For equal allocation use c(1,1); for unequal allocation use, for example, c(2,1) or c(3,1).
`nsim`	The number of samples to be generated from the alternate distribution.

Details

Under the assumption of equal covariance, the alternative distribution of the Hotelling-T^2 statistic is, up to a scaling constant, a non-central F distribution whose non-centrality parameter depends on the difference between the true mean vectors and the (common) covariance of the response. However, when the true covariances of the two groups differ, the alternative distribution is non-trivial. Koner and Luo (2023) proved that the test statistic then approximately follows the ratio of a linear combination of K (the dimension of the response) non-central chi-squared random variables, whose non-centrality parameters depend on the mean difference, to a chi-squared random variable whose degrees of freedom are a complicated function of the sample sizes in the two groups. See Koner and Luo (2023) for more details on the formula of the non-null distribution.
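For reference, the equal-covariance result mentioned above can be written out as follows. This is the standard textbook form, and it matches the scaling and non-centrality used in the Examples section. The symbols (n_1, n_2, n = n_1 + n_2, k for the response dimension, \delta for the mean difference, \Sigma for the common covariance, S_p for the pooled sample covariance) are notation introduced here for illustration, not taken from the package.

```latex
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{X}_1 - \bar{X}_2)^\top S_p^{-1} (\bar{X}_1 - \bar{X}_2),
\qquad
\frac{n - k - 1}{(n - 2)\,k}\, T^2 \;\sim\; F_{k,\; n-k-1}(\lambda),
\qquad
\lambda = \frac{n_1 n_2}{n_1 + n_2}\, \delta^\top \Sigma^{-1} \delta .
```

Setting \delta = 0 gives \lambda = 0, which recovers the central F distribution of the null case.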

Value

A named list with two elements.

samples - a vector of length nsim, containing the samples from the distribution of the Hotelling-T^2 statistic under unequal covariance.
denom.df - the denominator degrees of freedom of the chi-squared statistic obtained by approximating the sum of two Wishart distributions under unequal covariance.
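
As a rough illustration of how such samples can be turned into a power estimate, the sketch below uses base R only and stays in the equal-covariance case, where the exact answer is available for comparison. The vector `samples` is a stand-in for the `samples` element of the returned list. The critical value, the level 0.05 and the variable names are assumptions made for this illustration, not part of the package.

```r
# Sketch only: base R, equal-covariance case, where the distribution is known.
set.seed(1)
k <- 6; n1 <- 100; n2 <- 100; n <- n1 + n2
mean_diff <- rep(-0.15, k)   # hypothetical mean difference
sig <- diag(k)               # hypothetical common covariance

# Scaling constant and non-centrality parameter of the Hotelling-T^2 statistic.
scale_const <- (n - 2) * k / (n - k - 1)
ncp <- (n1 * n2 / n) * as.vector(crossprod(mean_diff, solve(sig, mean_diff)))

# Stand-in for the `samples` element returned by Sim_HotellingT_unequal_var().
samples <- scale_const * rf(1e4, k, n - k - 1, ncp = ncp)

# Critical value of a level-0.05 test, taken from the null (central F) distribution.
crit <- scale_const * qf(0.95, k, n - k - 1)

# Monte Carlo power: proportion of non-null samples exceeding the critical value.
power_mc <- mean(samples > crit)

# Exact power from the non-central F distribution, for comparison.
power_exact <- pf(crit / scale_const, k, n - k - 1, ncp = ncp, lower.tail = FALSE)
```

With unequal covariances no closed form is available, which is why the power has to be estimated from simulated samples.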

Author(s)

Salil Koner
Maintainer: Salil Koner salil.koner@duke.edu

References

Wang, Qiyao (2021) Two-sample inference for sparse functional data, Electronic Journal of Statistics, Vol. 15, 1395-1423
doi:10.1214/21-EJS1802.

Examples


# Case 1: Null hypothesis is true. True mean difference is zero, and the true
# covariance of the two groups are same.
k <- 5
mu1  <- rep(0,k); del  <- 0; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1))); n <- 200;
null.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=n, mean_diff=mu1-mu2,
                     sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)
# The following Kolmogorov-Smirnov test checks that, under the null hypothesis
# and when the covariances are the same, the distribution is a scaled
# central F distribution with k and n-k-1 degrees of freedom.
ks.test(null.dist.samples$samples, ((n - 2) * k / (n - k - 1)) * rf(n=1e3, k, n-k-1))


# Case 2: Alternate hypothesis is true. The mean difference is non-zero,
# and the covariances of the two groups are same:
k <- 6
mu1  <- rep(0,k); del  <- 0.15; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1;
n1 <- 100; n2 <- 100;
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=n1+n2, mean_diff=mu1-mu2,
                                               sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)
ks.test(alt.dist.samples$samples,
        {(n1+n2 - 2) * k /(n1+n2 - k -1)}*rf(n=1e3, k, n1+n2-k-1,
          ncp = {(n1*n2)/(n1+n2)}*as.vector(crossprod(mu1-mu2, solve(sig1, mu1-mu2))) ) )


# Case 3: Alternate hypothesis is true. The mean difference is non-zero,
# and the covariances of the two groups are different
k <- 5
mu1  <- rep(0,k); del  <- 0.25; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1)))
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=200, mean_diff=mu1-mu2,
                                               sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)

# Generate samples with unequal allocation ratio:
k <- 8
mu1  <- rep(0,k); del  <- 0.4; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1)))
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=150, mean_diff=mu1-mu2,
                                               sig1=sig1, sig2=sig2, alloc.ratio=c(2,1), nsim=1e3)

[Package fPASS version 1.0.0 Index]