PCAboot {ICtest} | R Documentation |
Bootstrap-Based Testing for Subsphericity
Description
The function tests, assuming an elliptical model, that the last p-k eigenvalues of a scatter matrix are equal and that the k interesting components are those with a larger variance.
To obtain p-values, two different bootstrapping strategies are available, and the user can provide the scatter matrix to be used as a function.
Usage
PCAboot(X, k, n.boot = 200, s.boot = "B1", S = MeanCov, Sargs = NULL)
Arguments
X |
a numeric data matrix with p>1 columns. |
k |
the number of eigenvalues larger than the equal ones. Can be between 0 and p-2. |
n.boot |
number of bootstrapping samples. |
s.boot |
bootstrapping strategy to be used. Possible values are "B1" and "B2"; see Details. |
S |
A function which returns a list that has as its first element a location vector and as the second element the scatter matrix. |
Sargs |
list of further arguments passed on to the function specified in S. |
Details
Here the function S needs to return a list whose first element is a location vector and whose second element is a scatter matrix.
The location is used to center the data and the scatter matrix is used to perform PCA.
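As a minimal sketch of this contract, the following base R code defines a user-supplied scatter functional and applies the centering and PCA steps by hand. The name mean_cov_sketch is hypothetical and not part of ICtest; it simply returns the sample mean and sample covariance in the required order.

```r
# Hypothetical scatter functional (not part of ICtest): sample mean and
# sample covariance, returned in the order described above.
mean_cov_sketch <- function(X, ...) {
  X <- as.matrix(X)
  list(location = colMeans(X),  # first element: location vector
       scatter  = cov(X))       # second element: scatter matrix
}

set.seed(1)
X <- cbind(rnorm(100, sd = 2), rnorm(100), rnorm(100))
res <- mean_cov_sketch(X)

# Center the data with the location, then perform PCA on the scatter matrix.
X_centered <- sweep(X, 2, res[[1]], "-")
eig <- eigen(res[[2]], symmetric = TRUE)
scores <- X_centered %*% eig$vectors  # principal component scores
```

A function of this shape should be usable as the S argument, since only the order of the two list elements is assumed to matter here.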
Consider X as the centered data and denote by W the transformation matrix to the principal components. The corresponding eigenvalues from PCA are d_1,...,d_p. Under the null, d_k > d_{k+1} = ... = d_{p}.
Denote further by \bar{d} the mean of the last p-k eigenvalues and by D^* = diag(d_1,...,d_k,\bar{d},...,\bar{d}) a p \times p diagonal matrix. Let S here denote the matrix of principal components (not the function argument S), which can be decomposed into S_1 and S_2, where S_1 contains the k interesting principal components and S_2 the last p-k principal components.
For a sample of size n, the test statistic used for the bootstrapping tests is
T = n / (\bar{d}^2) \sum_{i=k+1}^p (d_i - \bar{d})^2.
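The statistic can be computed directly from a vector of eigenvalues. The following is a small sketch, not the package's internal code; the helper name subsphericity_stat and the eigenvalues used are made up for illustration.

```r
# Sketch of T = n / dbar^2 * sum_{i = k+1}^p (d_i - dbar)^2.
# d: eigenvalues in decreasing order, k: number of larger eigenvalues,
# n: sample size. The function name is hypothetical.
subsphericity_stat <- function(d, k, n) {
  p <- length(d)
  stopifnot(k >= 0, k <= p - 2)
  tail_d <- d[(k + 1):p]   # the last p - k eigenvalues
  dbar <- mean(tail_d)     # their mean
  n * sum((tail_d - dbar)^2) / dbar^2
}

# Illustrative eigenvalues: the last three are close to equal.
d <- c(4, 2.25, 1.1, 1.0, 0.9)
T_val <- subsphericity_stat(d, k = 2, n = 200)
T_val  # approximately 4
```

The statistic is zero when the last p-k eigenvalues are exactly equal and grows with their relative spread.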
The function then offers two bootstrapping strategies:
- s.boot="B1": The first strategy has the following steps:
  1. Take a bootstrap sample S^* of size n from S and decompose it into S_1^* and S_2^*.
  2. Every observation in S_2^* is transformed with a different random orthogonal matrix.
  3. Recombine S^* = (S_1^*, S_2^*) and create X^* = S^* W.
  4. Compute the test statistic based on X^*.
  5. Repeat the previous steps n.boot times.
- s.boot="B2": The second strategy has the following steps:
  1. Scale each principal component using the matrix D, i.e. Z = S D.
  2. Take a bootstrap sample Z^* of size n from Z.
  3. Every observation in Z^* is transformed with a different random orthogonal matrix.
  4. Recreate X^* = Z^* {D^*}^{-1} W.
  5. Compute the test statistic based on X^*.
  6. Repeat the previous steps n.boot times.
To create the random orthogonal matrices the function rorth is used.
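The steps of strategy "B1" can be sketched in base R as follows. This is an illustration under stated assumptions, not the package implementation: the sample mean and covariance stand in for the scatter functional, rand_orth is a hypothetical stand-in for rorth based on the QR decomposition of a Gaussian matrix, and the p-value formula (count + 1) / (n.boot + 1) is a common bootstrap convention assumed here.

```r
# Hypothetical stand-in for rorth: random orthogonal matrix obtained from the
# QR decomposition of a Gaussian matrix, with column signs fixed.
rand_orth <- function(m) {
  qrd <- qr(matrix(rnorm(m * m), m, m))
  qr.Q(qrd) %*% diag(sign(diag(qr.R(qrd))), m)
}

# Test statistic based on the eigenvalues of the sample covariance (assumption).
stat_T <- function(X, k) {
  d <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
  tail_d <- d[(k + 1):length(d)]
  dbar <- mean(tail_d)
  nrow(X) * sum((tail_d - dbar)^2) / dbar^2
}

boot_B1 <- function(X, k, n.boot = 200) {
  n <- nrow(X); p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X), "-")      # centered data
  V <- eigen(cov(Xc), symmetric = TRUE)$vectors
  W <- t(V)                                # transformation to the components
  S <- Xc %*% V                            # scores, so that S %*% W equals Xc
  t_obs <- stat_T(Xc, k)
  t_boot <- replicate(n.boot, {
    # Step 1: bootstrap sample of the scores, split into S_1^* and S_2^*.
    S_star <- S[sample.int(n, n, replace = TRUE), , drop = FALSE]
    S2 <- S_star[, (k + 1):p, drop = FALSE]
    # Step 2: rotate every observation of S_2^* with its own random matrix.
    for (i in seq_len(n)) S2[i, ] <- S2[i, ] %*% rand_orth(p - k)
    # Step 3: recombine and map back to the original coordinates.
    S_star[, (k + 1):p] <- S2
    # Step 4: test statistic of the bootstrap sample X^* = S^* W.
    stat_T(S_star %*% W, k)
  })
  list(statistic = t_obs,
       p.value = (sum(t_boot >= t_obs) + 1) / (n.boot + 1))
}

set.seed(1)
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
res <- boot_B1(X, k = 2, n.boot = 30)
```

Rotating each observation of S_2^* separately imposes sphericity on the last p-k components, so the bootstrap samples are generated under the null while the first k components are left untouched.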
Value
A list of class ictest inheriting from class htest containing:
statistic |
the value of the test statistic. |
p.value |
the p-value of the test. |
parameter |
the degrees of freedom of the test. |
method |
character string indicating which test was performed. |
data.name |
character string giving the name of the data. |
alternative |
character string specifying the alternative hypothesis. |
k |
the number of larger eigenvalues used in the testing problem. |
W |
the transformation matrix to the principal components. |
S |
data matrix with the centered principal components. |
D |
the underlying eigenvalues. |
MU |
the location of the data which was subtracted before calculating the principal components. |
SCATTER |
the computed scatter matrix. |
scatter |
character string denoting which scatter function was used. |
s.boot |
character string denoting which bootstrapping test version was used. |
Author(s)
Klaus Nordhausen
References
Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.
See Also
Examples
library(ICtest)
n <- 200
X <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1.5), rnorm(n), rnorm(n), rnorm(n))
# For demonstration purposes n.boot is kept small; it should be larger in real applications.
TestCov <- PCAboot(X, k = 2, n.boot=30)
TestCov
TestTM <- PCAboot(X, k = 1, n.boot=30, s.boot = "B2", S = "tM", Sargs = list(df=2))
TestTM