R: CDF of Hotelling-T^2 statistic.

pHotellingT {fPASS}

R Documentation

CDF of Hotelling-`T^2` statistic.

Description

The function pHotellingT() computes the cumulative distribution function (CDF), P(T <= q), of the two-sample Hotelling-T^2 statistic in the multivariate response setting. It is used to compute the power function of the Two-Sample (TS) Projection-based test (Wang 2021, EJS) for sparsely observed univariate functional data.

Usage

pHotellingT(
  q,
  total_sample_size,
  mean_diff,
  sig1,
  sig2,
  alloc.ratio = c(1, 1),
  lower.tail = TRUE,
  nsim = 10000
)

Arguments

`q`	The point at which the CDF is to be evaluated.
`total_sample_size`	Target sample size, must be a positive integer.
`mean_diff`	The difference between the mean vectors of the two groups, must be a vector.
`sig1`	The true (or estimated) covariance matrix of the first group. Must be symmetric (`is.symmetric(sig1) == TRUE`) and positive definite (`chol(sig1)` runs without an error).
`sig2`	The true (or estimated) covariance matrix of the second group. Must be symmetric (`is.symmetric(sig2) == TRUE`) and positive definite (`chol(sig2)` runs without an error).
`alloc.ratio`	Allocation of the total sample size between the two groups. Must be a vector of two positive numbers. Use c(1,1) for equal allocation; for unequal allocation use, for example, c(2,1) or c(3,1).
`lower.tail`	If TRUE, the CDF is returned; otherwise the right-tail probability is returned.
`nsim`	The number of samples to be generated from the alternative distribution.
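
The requirements on the arguments can be checked before calling the function. The following is a minimal sketch in base R, not code from fPASS: `check_cov()` and `split_sample()` are hypothetical helpers, the base function `isSymmetric()` stands in for the symmetry check, and the rounding rule used to split the total sample size is an assumption that may differ from what the package does internally.

```r
# Hypothetical helper: stop unless `sig` is a symmetric, positive definite matrix.
check_cov <- function(sig) {
  stopifnot(is.matrix(sig), nrow(sig) == ncol(sig))
  if (!isSymmetric(sig)) stop("covariance matrix is not symmetric")
  # chol() reads only the upper triangle and errors when the matrix
  # is not positive definite, so symmetry is checked first.
  ok <- tryCatch({ chol(sig); TRUE }, error = function(e) FALSE)
  if (!ok) stop("covariance matrix is not positive definite")
  invisible(TRUE)
}

# Hypothetical helper: split a total sample size by an allocation ratio.
# Assumption: group 1 gets the rounded share, group 2 gets the remainder.
split_sample <- function(total_sample_size, alloc.ratio = c(1, 1)) {
  stopifnot(length(alloc.ratio) == 2, all(alloc.ratio > 0))
  n1 <- round(total_sample_size * alloc.ratio[1] / sum(alloc.ratio))
  c(n1 = n1, n2 = total_sample_size - n1)
}

check_cov(diag(4))
sizes <- split_sample(180, alloc.ratio = c(2, 1))
```

With these helpers, an invalid covariance matrix fails early with a readable message instead of surfacing as an error deep inside the simulation.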

Details

Under the assumption of equal covariance, the alternative distribution of the scaled Hotelling-T^2 statistic, (n-K-1)T/((n-2)K), where n is the total sample size and K is the dimension of the response, is a non-central F distribution whose non-centrality parameter depends on the difference between the true mean vectors and the (common) covariance of the response. However, when the true covariances of the two groups differ, the alternative distribution becomes non-trivial. Koner and Luo (2023) proved that the alternative distribution of the test statistic is approximately that of a ratio of a linear combination of K non-central chi-squared random variables (whose non-centrality parameters depend on the mean difference) to a chi-squared random variable whose degrees of freedom depend on a complicated function of the sample sizes in the two groups. This function first calls the Sim_HotellingT_unequal_var function to obtain samples from the non-null distribution and then computes the CDF numerically, with high precision, from a large number of samples. See Koner and Luo (2023) for details on the formula of the non-null distribution.
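
For the equal-covariance case, the non-central F result can be evaluated directly with `pf()` from the stats package. The sketch below is illustrative only: `p_hotelling_equal_var()` is a hypothetical name, not a function in fPASS, and it assumes that `q` is given on the scale of the unscaled statistic T, which may not match the scaling used by pHotellingT().

```r
# Hypothetical helper: tail probability of the two-sample Hotelling-T^2
# statistic when both groups share the covariance matrix `sig`.
# Assumption: `q` is on the scale of the unscaled statistic T.
p_hotelling_equal_var <- function(q, n1, n2, mean_diff, sig, lower.tail = TRUE) {
  K <- length(mean_diff)
  n <- n1 + n2
  # Non-centrality parameter: (n1 * n2 / n) * delta' Sigma^{-1} delta
  ncp <- (n1 * n2 / n) * drop(crossprod(mean_diff, solve(sig, mean_diff)))
  # (n - K - 1) T / ((n - 2) K) follows F(K, n - K - 1) with non-centrality ncp
  f_q <- (n - K - 1) * q / ((n - 2) * K)
  pf(f_q, df1 = K, df2 = n - K - 1, ncp = ncp, lower.tail = lower.tail)
}

# Right-tail probabilities on a grid of cutoffs, equal covariances
p_upper <- p_hotelling_equal_var(seq(0, 30, length.out = 20),
                                 n1 = 120, n2 = 60,
                                 mean_diff = rep(-0.4, 4), sig = diag(4),
                                 lower.tail = FALSE)
```

When sig1 and sig2 are equal, a closed form of this kind gives a reference point against which simulation-based probabilities can be compared.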

Value

The CDF of the Hotelling-T^2 statistic evaluated at q if lower.tail == TRUE; otherwise the right-tail probability.
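
The two kinds of return value can be illustrated with a brute-force Monte Carlo sketch. This is neither the approximation of Koner and Luo (2023) nor the package's Sim_HotellingT_unequal_var: it simulates the pooled-covariance two-sample statistic directly under multivariate normality using base R. `sim_T2()` is a hypothetical name, and the scaling of the statistic is an assumption that may differ from the one used by pHotellingT().

```r
# Hypothetical helper: draw `nsim` values of the pooled two-sample
# Hotelling-T^2 statistic for Gaussian groups with covariances sig1, sig2.
sim_T2 <- function(nsim, n1, n2, mean_diff, sig1, sig2) {
  K <- length(mean_diff)
  n <- n1 + n2
  # The difference of sample means is N(mean_diff, sig1 / n1 + sig2 / n2)
  R <- chol(sig1 / n1 + sig2 / n2)
  d <- matrix(rnorm(nsim * K), nsim, K) %*% R +
    matrix(mean_diff, nsim, K, byrow = TRUE)
  # (n_g - 1) * S_g is Wishart with n_g - 1 degrees of freedom
  W1 <- rWishart(nsim, n1 - 1, sig1)
  W2 <- rWishart(nsim, n2 - 1, sig2)
  vapply(seq_len(nsim), function(i) {
    S_pooled <- (W1[, , i] + W2[, , i]) / (n - 2)
    (n1 * n2 / n) * drop(crossprod(d[i, ], solve(S_pooled, d[i, ])))
  }, numeric(1))
}

set.seed(1)
draws <- sim_T2(2000, n1 = 120, n2 = 60, mean_diff = rep(-0.4, 4),
                sig1 = diag(4), sig2 = 2 * diag(4))
q <- 20
p_lower <- mean(draws <= q)  # analogue of lower.tail = TRUE
p_upper <- mean(draws > q)   # analogue of lower.tail = FALSE
```

The two empirical probabilities sum to one, which is the relationship between the values returned under lower.tail = TRUE and lower.tail = FALSE.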

Author(s)

Salil Koner
Maintainer: Salil Koner salil.koner@duke.edu

Examples


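B           <- 10000                # number of Monte Carlo samples
k           <- 4                    # dimension of the response
n2          <- 60                   # sample size of the second group
n1_by_n2    <- 2                    # allocation ratio n1 / n2
n1          <- n1_by_n2 * n2        # sample size of the first group
mu1         <- rep(0, k)
del         <- 0.4                  # shift in every coordinate of the mean
mu2         <- mu1 + rep(del, k)
sig1        <- diag(k)
sig2        <- sig1
cutoff      <- seq(0, 30, length.out = 20)
# lower.tail = FALSE, so right-tail probabilities are returned
the_cdf     <- round(pHotellingT(cutoff, n1 + n2, mu1 - mu2,
                                 sig1, sig2, alloc.ratio = c(2, 1),
                                 lower.tail = FALSE, nsim = B), 3)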

[Package fPASS version 1.0.0 Index]
