Sim_HotellingT_unequal_var {fPASS} | R Documentation |
Samples from the non-null distribution of the Hotelling-T^2
statistic under unequal covariance.
Description
The function Sim_HotellingT_unequal_var() generates samples from the
(non-null) distribution of the two-sample Hotelling-T^2 statistic
under the assumption of unequal covariance of the multivariate response
between the two groups. This function is used to compute the power function
of the Two-Sample (TS) Projection-based test (Wang 2021, EJS)
for sparsely observed univariate functional data.
Usage
Sim_HotellingT_unequal_var(
total_sample_size,
mean_diff,
sig1,
sig2,
alloc.ratio = c(1, 1),
nsim = 10000
)
Arguments
total_sample_size |
Target sample size, must be a positive integer. |
mean_diff |
The difference in the mean vector between the two groups, must be a vector. |
sig1 |
The true (or estimated) covariance matrix for the first group. Must be symmetric
( |
sig2 |
The true (or estimated) covariance matrix for the second group. Must be symmetric
( |
alloc.ratio |
Allocation of the total sample size between the two groups. Must be a vector of two positive numbers. For equal allocation use c(1,1); for unequal allocation use, for example, c(2,1) or c(3,1). |
nsim |
The number of samples to be generated from the alternate distribution. |
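To make the role of alloc.ratio concrete, the sketch below shows one plausible way a total sample size could be divided between the two groups. The helper name split_sample_size and the rounding rule are assumptions made for illustration; the package may split the sample differently.

```r
# Hypothetical helper (not part of fPASS): split a total sample size
# according to an allocation ratio. The rounding rule is an assumption.
split_sample_size <- function(total_sample_size, alloc.ratio = c(1, 1)) {
  stopifnot(length(alloc.ratio) == 2, all(alloc.ratio > 0))
  # Group 1 receives its share of the total; group 2 receives the remainder
  n1 <- round(total_sample_size * alloc.ratio[1] / sum(alloc.ratio))
  c(n1 = n1, n2 = total_sample_size - n1)
}

split_sample_size(200, c(1, 1))  # n1 = 100, n2 = 100
split_sample_size(150, c(2, 1))  # n1 = 100, n2 = 50
```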
Details
Under the assumption of equal covariance, the alternative
distribution of the Hotelling-T^2 statistic is a non-central F distribution
(up to a scaling constant), with the non-centrality parameter depending on the
difference between the true mean vectors and the (common) covariance of the response.
However, when the true covariances of the two groups of responses differ, the
alternative distribution becomes non-trivial. Koner and Luo (2023)
proved that the test statistic then approximately follows the ratio of
a linear combination of K (the dimension of the response) non-central
chi-squared random variables (whose non-centrality parameters depend on the mean difference)
and a chi-squared random variable whose degrees of freedom depend on a complicated
function of the sample sizes in the two groups.
See Koner and Luo (2023) for more details on the formula of the non-null distribution.
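As a sanity check on the distribution described above, the statistic can also be simulated by brute force: draw both groups from multivariate normal distributions and compute the pooled-covariance Hotelling-T^2 on each replicate. The sketch below does this in base R. The function name sim_T2_bruteforce is hypothetical, and this is not the approximation of Koner and Luo (2023) nor the method used inside the package; it is only a slow reference to compare against.

```r
# Hypothetical brute-force simulator (base R only, not the fPASS method).
# Group 1 rows ~ N(mean_diff, sig1); group 2 rows ~ N(0, sig2).
sim_T2_bruteforce <- function(n1, n2, mean_diff, sig1, sig2, nsim = 500) {
  k <- length(mean_diff)
  R1 <- chol(sig1)  # upper-triangular factor with t(R1) %*% R1 == sig1
  R2 <- chol(sig2)
  replicate(nsim, {
    X <- sweep(matrix(rnorm(n1 * k), n1, k) %*% R1, 2, mean_diff, "+")
    Y <- matrix(rnorm(n2 * k), n2, k) %*% R2
    d <- colMeans(X) - colMeans(Y)
    # Pooled sample covariance, as in the classical two-sample statistic
    S_pooled <- ((n1 - 1) * cov(X) + (n2 - 1) * cov(Y)) / (n1 + n2 - 2)
    (n1 * n2 / (n1 + n2)) * drop(crossprod(d, solve(S_pooled, d)))
  })
}

set.seed(1)
k <- 5; del <- 0.25
sig1 <- diag(k)
sig2 <- sig1 + del * toeplitz(c(1, rep(0.5, k - 1)))
t2 <- sim_T2_bruteforce(100, 100, rep(-del, k), sig1, sig2, nsim = 200)
summary(t2)
```

A two-sample ks.test() between such draws and the samples element returned by Sim_HotellingT_unequal_var() is one way to gauge how close the approximation is for a given design.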
Value
A named list with two elements.
- samples - a vector of length nsim, containing the samples from the distribution of the Hotelling T statistic under unequal variance.
- denom.df - the denominator degrees of freedom of the chi-square statistic obtained by approximating the sum of two Wishart distributions under unequal variance.
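Since the function is meant to support power calculations, the samples element would typically be compared against a critical value. The sketch below shows one way to do that. The helper power_from_samples is hypothetical, and it assumes the test rejects when the statistic exceeds the scaled central-F critical value from the equal-covariance case; the actual rejection rule used by the package may differ. A stand-in vector of draws is used in place of the function's output so the snippet runs without the package.

```r
# Hypothetical helper (not part of fPASS): Monte Carlo power from T^2 draws.
# ASSUMPTION: reject when T^2 exceeds the scaled central-F critical value.
power_from_samples <- function(samples, n, k, alpha = 0.05) {
  crit <- ((n - 2) * k / (n - k - 1)) * qf(1 - alpha, k, n - k - 1)
  mean(samples > crit)  # proportion of simulated statistics that reject
}

# Stand-in for the `samples` element: equal-covariance alternative with
# identity covariance, mean shift 0.25 per coordinate, n1 = n2 = 100.
set.seed(1)
n <- 200; k <- 5
ncp <- (100 * 100 / 200) * k * 0.25^2
stand_in <- ((n - 2) * k / (n - k - 1)) * rf(1e4, k, n - k - 1, ncp = ncp)
power_from_samples(stand_in, n, k)
```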
Author(s)
Salil Koner
Maintainer: Salil Koner
salil.koner@duke.edu
References
Wang, Qiyao (2021)
Two-sample inference for sparse functional data, Electronic Journal of Statistics,
Vol. 15, 1395-1423
doi:10.1214/21-EJS1802.
See Also
Hotelling::hotelling.test(), Hotelling::hotelling.stat() to generate empirical samples
of the Hotelling T-statistic from empirical data.
Examples
# Case 1: Null hypothesis is true. The true mean difference is zero, and the
# covariances of the two groups are the same.
k <- 5
mu1 <- rep(0,k); del <- 0; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1))); n <- 200;
null.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=n, mean_diff=mu1-mu2,
sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)
# The following Kolmogorov-Smirnov test confirms that under the null hypothesis,
# and when the covariances are the same, the distribution is exactly a
# central F distribution with k and n-k-1 degrees of freedom,
# scaled by (n-2)*k/(n-k-1).
ks.test(null.dist.samples$samples, ((n - 2) * k / (n - k - 1)) * rf(n=1e3, k, n-k-1))
# Case 2: Alternate hypothesis is true. The mean difference is non-zero,
# and the covariances of the two groups are the same:
k <- 6
mu1 <- rep(0,k); del <- 0.15; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1;
n1 <- 100; n2 <- 100;
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=n1+n2, mean_diff=mu1-mu2,
sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)
ks.test(alt.dist.samples$samples,
{(n1+n2 - 2) * k /(n1+n2 - k -1)}*rf(n=1e3, k, n1+n2-k-1,
ncp = {(n1*n2)/(n1+n2)}*as.vector(crossprod(mu1-mu2, solve(sig1, mu1-mu2))) ) )
# Case 3: Alternate hypothesis is true. The mean difference is non-zero,
# and the covariances of the two groups are different
k <- 5
mu1 <- rep(0,k); del <- 0.25; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1)))
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=200, mean_diff=mu1-mu2,
sig1=sig1, sig2=sig2, alloc.ratio=c(1,1), nsim=1e3)
# Generate samples with unequal allocation ratio:
k <- 8
mu1 <- rep(0,k); del <- 0.4; mu2 <- mu1 + rep(del, k);
sig1 <- diag(k); sig2 <- sig1 + del*toeplitz(c(1,rep(0.5, k-1)))
alt.dist.samples <- Sim_HotellingT_unequal_var(total_sample_size=150, mean_diff=mu1-mu2,
sig1=sig1, sig2=sig2, alloc.ratio=c(2,1), nsim=1e3)