| PCAboot {ICtest} | R Documentation |
Bootstrap-Based Testing for Subsphericity
Description
The function tests, assuming an elliptical model, that the last p-k eigenvalues of
a scatter matrix are equal and the k interesting components are those with a larger variance.
To obtain p-values, two bootstrapping strategies are available, and the user can supply the scatter matrix to be used
as a function.
Usage
PCAboot(X, k, n.boot = 200, s.boot = "B1", S = MeanCov, Sargs = NULL)
Arguments
X |
a numeric data matrix with p>1 columns. |
k |
the number of eigenvalues larger than the equal ones. Can be between 0 and p-2. |
n.boot |
number of bootstrapping samples. |
s.boot |
bootstrapping strategy to be used. Possible values are "B1" and "B2". |
S |
A function which returns a list that has as its first element a location vector and as the second element the scatter matrix. |
Sargs |
list of further arguments passed on to the function specified in S. |
Details
Here the function S needs to return a list where the first element is a location vector and the second one a scatter matrix.
The location is used to center the data and the scatter matrix is used to perform PCA.
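As an illustration of the required return structure, the following minimal sketch defines a user-supplied scatter function. The name myMeanCov and the element names location and scatter are hypothetical; only the order of the two list elements is taken from the description above.

```r
# Hypothetical user-defined scatter function (name and element names are
# assumptions for illustration): the first list element is a location
# vector, the second a scatter matrix.
myMeanCov <- function(X, ...) {
  X <- as.matrix(X)
  list(location = colMeans(X),  # first element: location vector of length p
       scatter  = cov(X))       # second element: p x p scatter matrix
}

set.seed(1)
X <- matrix(rnorm(100 * 3), ncol = 3)
res <- myMeanCov(X)
length(res[[1]])  # number of columns p
dim(res[[2]])     # p x p
```

A function of this shape could presumably be passed as S = myMeanCov, with any extra arguments supplied through Sargs.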
Consider X as the centered data and denote by W the transformation matrix to the principal components. The corresponding eigenvalues
from PCA are d_1,...,d_p. Under the null, d_k > d_{k+1} = ... = d_{p}.
Denote further by \bar{d} the mean of the last p-k eigenvalues and by D^* = diag(d_1,...,d_k,\bar{d},...,\bar{d}) a p \times p diagonal matrix. Assume that S is the matrix with principal components which can be decomposed into S_1 and S_2 where
S_1 contains the k interesting principal components and S_2 the last p-k principal components.
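The notation above can be made concrete with a short base R sketch. It assumes the regular covariance matrix as scatter and takes W as the transposed eigenvector matrix, so that the centered data equal S W; the convention used internally by PCAboot may differ.

```r
set.seed(1)
n <- 200; p <- 5; k <- 2
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))

Xc  <- sweep(X, 2, colMeans(X))         # centered data
eig <- eigen(cov(X), symmetric = TRUE)
d   <- eig$values                       # eigenvalues d_1 >= ... >= d_p
W   <- t(eig$vectors)                   # assumed convention: rows are directions
S   <- Xc %*% t(W)                      # principal components, so Xc = S %*% W

S1 <- S[, seq_len(k), drop = FALSE]     # the k interesting components
S2 <- S[, (k + 1):p, drop = FALSE]      # the last p-k components

dbar  <- mean(d[(k + 1):p])             # mean of the last p-k eigenvalues
Dstar <- diag(c(d[seq_len(k)], rep(dbar, p - k)))  # D^*
```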
For a sample of size n, the test statistic used for the bootstrapping tests is
T = n / (\bar{d}^2) \sum_{i=k+1}^p (d_i - \bar{d})^2.
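The statistic can be computed directly from a vector of eigenvalues. The helper name subsphericity_stat below is hypothetical, and the sketch uses the regular covariance matrix; it is meant to illustrate the formula, not to reproduce the package internals.

```r
# Hypothetical helper: T = n / dbar^2 * sum_{i = k+1}^p (d_i - dbar)^2
subsphericity_stat <- function(d, k, n) {
  p <- length(d)
  tail_d <- d[(k + 1):p]   # the last p-k eigenvalues
  dbar <- mean(tail_d)     # their mean
  n / dbar^2 * sum((tail_d - dbar)^2)
}

set.seed(1)
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
d <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
subsphericity_stat(d, k = 2, n = n)
```

The statistic is zero exactly when the last p-k eigenvalues coincide and grows as they spread out around their mean.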
The function then offers two bootstrapping strategies:
-
s.boot="B1": The first strategy has the following steps:
1. Take a bootstrap sample S^* of size n from S and decompose it into S_1^* and S_2^*.
2. Every observation in S_2^* is transformed with a different random orthogonal matrix.
3. Recombine S^* = (S_1^*, S_2^*) and create X^* = S^* W.
4. Compute the test statistic based on X^*.
5. Repeat the previous steps n.boot times.
-
s.boot="B2": The second strategy has the following steps:
1. Scale each principal component using the matrix D, i.e. Z = S D.
2. Take a bootstrap sample Z^* of size n from Z.
3. Every observation in Z^* is transformed with a different random orthogonal matrix.
4. Recreate X^* = Z^* {D^*}^{-1} W.
5. Compute the test statistic based on X^*.
6. Repeat the previous steps n.boot times.
To create the random orthogonal matrices, the function rorth is used.
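The steps of strategy "B1" can be sketched in base R as follows. This is an illustration under stated assumptions, not the package implementation: rand_orth is a hypothetical QR-based stand-in for rorth, the regular covariance matrix is used as scatter, and the p-value formula (count of bootstrap statistics at least as large as the observed one, plus one, divided by n.boot + 1) is an assumption.

```r
# Hypothetical stand-in for rorth: random orthogonal matrix from the QR
# decomposition of a Gaussian matrix, with a sign correction.
rand_orth <- function(m) {
  qrd <- qr(matrix(rnorm(m * m), m, m))
  qr.Q(qrd) %*% diag(sign(diag(qr.R(qrd))), m)
}

# Test statistic based on the eigenvalues of the covariance matrix.
stat <- function(X, k) {
  n <- nrow(X); p <- ncol(X)
  d <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
  tail_d <- d[(k + 1):p]
  dbar <- mean(tail_d)
  n / dbar^2 * sum((tail_d - dbar)^2)
}

boot_B1 <- function(X, k, n.boot = 50) {
  n <- nrow(X); p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))                   # center the data
  W  <- t(eigen(cov(X), symmetric = TRUE)$vectors) # assumed convention for W
  S  <- Xc %*% t(W)                                # principal components
  T_obs <- stat(X, k)
  T_boot <- replicate(n.boot, {
    Sb  <- S[sample.int(n, n, replace = TRUE), , drop = FALSE]  # step 1
    S1b <- Sb[, seq_len(k), drop = FALSE]
    S2b <- Sb[, (k + 1):p, drop = FALSE]
    # step 2: each row of S2b gets its own random orthogonal matrix
    S2b <- t(apply(S2b, 1, function(s) s %*% rand_orth(p - k)))
    stat(cbind(S1b, S2b) %*% W, k)                 # steps 3 and 4
  })
  # assumed p-value convention, see the note above
  list(statistic = T_obs,
       p.value = (sum(T_boot >= T_obs) + 1) / (n.boot + 1))
}

set.seed(1)
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
res <- boot_B1(X, k = 2, n.boot = 20)
res$p.value
```

Rotating each observation of S_2^* separately leaves the first k components untouched while making the remaining p-k components spherical in the bootstrap world, which is what the null hypothesis states.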
Value
A list of class ictest inheriting from class htest containing:
statistic |
the value of the test statistic. |
p.value |
the p-value of the test. |
parameter |
the degrees of freedom of the test. |
method |
character string indicating which test was performed. |
data.name |
character string giving the name of the data. |
alternative |
character string specifying the alternative hypothesis. |
k |
the number of larger eigenvalues used in the testing problem. |
W |
the transformation matrix to the principal components. |
S |
data matrix with the centered principal components. |
D |
the underlying eigenvalues. |
MU |
the location of the data which was subtracted before calculating the principal components. |
SCATTER |
the computed scatter matrix. |
scatter |
character string denoting which scatter function was used. |
s.boot |
character string denoting which bootstrapping test version was used. |
Author(s)
Klaus Nordhausen
References
Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.
See Also
Examples
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
# n.boot is kept small here for demonstration purposes; it should be larger in real applications
TestCov <- PCAboot(X, k = 2, n.boot=30)
TestCov
TestTM <- PCAboot(X, k = 1, n.boot=30, s.boot = "B2", S = "tM", Sargs = list(df=2))
TestTM