ICSboot {ICtest} | R Documentation |
Bootstrap-based Testing for the Number of Gaussian Components in NGCA Using Two Scatter Matrices
Description
In independent component analysis (ICA), Gaussian components are considered uninteresting. The function uses bootstrap tests, based on ICS using any combination of two scatter matrices, to decide if there are p-k Gaussian components, where p is the dimension of the data. The function offers two different bootstrapping strategies.
Usage
ICSboot(X, k, S1=cov, S2=cov4, S1args=NULL, S2args=NULL, n.boot = 200, s.boot = "B1")
Arguments
X |
a numeric data matrix with p>1 columns. |
k |
the number of non-gaussian components under the null. |
S1 |
name of the first scatter matrix function. The function must return only a matrix. Default is cov. |
S2 |
name of the second scatter matrix function. The function must return only a matrix. Default is cov4. |
S1args |
list with optional additional arguments for S1. |
S2args |
list with optional additional arguments for S2. |
n.boot |
number of bootstrapping samples. |
s.boot |
bootstrapping strategy to be used. Possible values are "B1" (the default) and "B2"; see Details. |
Details
While in FOBIasymp and FOBIboot the two scatters used are always cov and cov4, this function can be used with any two scatter functions. In that case, however, the values of the Gaussian eigenvalues are in general not known and depend on the scatter functions used. Therefore the test uses as test statistic the k successive eigenvalues with the smallest variance. This means the default here might differ from FOBIasymp and FOBIboot.
Given eigenvalues d_1,...,d_p the function thus orders the components in descending order according to the "variance" criterion. Under the null it is then assumed that the first k interesting components are mutually independent and non-normal and the last p-k components are Gaussian.
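The "variance" criterion can be illustrated with a minimal sketch in base R: among all windows of successive (sorted) eigenvalues, pick the one whose values are most alike. The helper name min_var_window and its explicit window-length argument m are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical helper (not part of ICtest): among all windows of m
# successive sorted eigenvalues, find the one with the smallest variance.
min_var_window <- function(d, m) {
  stopifnot(m >= 2, m <= length(d))
  d_sorted <- sort(d, decreasing = TRUE)
  starts <- seq_len(length(d_sorted) - m + 1)
  # Sample variance of each window of m successive eigenvalues
  window_var <- vapply(
    starts,
    function(i) var(d_sorted[i:(i + m - 1)]),
    numeric(1)
  )
  best <- which.min(window_var)
  idx <- best:(best + m - 1)
  list(index = idx, values = d_sorted[idx], variance = window_var[best])
}

# Made-up eigenvalues: three of them are nearly equal
d <- c(3.2, 1.9, 1.02, 1.00, 0.98, 0.4)
res <- min_var_window(d, m = 3)
res$values
```

Eigenvalues that belong to Gaussian components are expected to be (nearly) equal, which is why a small variance within a window is taken as evidence of Gaussianity.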
Let d_1,...,d_p be the ordered eigenvalues, W the correspondingly ordered unmixing matrix, and s_i = W (x_i-MU) the corresponding source vectors, which give the source matrix S. S can be decomposed into S_1 and S_2, where S_1 is the matrix with the k non-Gaussian components and S_2 the matrix with the Gaussian components (under the null).
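The quantities d, W, MU, S, S_1 and S_2 can be sketched in base R as follows. This is only an illustration under stated assumptions: scatter4 is a hand-written stand-in for a fourth-moment scatter (not ICS::cov4 itself), the sample mean is used for MU, and the components are kept in plain descending eigenvalue order rather than ordered by the "variance" criterion. The simulated non-Gaussian sources are heavy-tailed so that, in this toy setting, they are expected to come first.

```r
set.seed(1)
n <- 500
k <- 2  # number of non-Gaussian components assumed under the null

# Two heavy-tailed sources and two Gaussian sources, mixed linearly
S_true <- cbind(rexp(n), rchisq(n, 1), rnorm(n), rnorm(n))
X <- S_true %*% t(matrix(rnorm(16), ncol = 4))
p <- ncol(X)

# Stand-in fourth-moment scatter (an assumption of this sketch)
scatter4 <- function(X) {
  MU <- colMeans(X)
  Xc <- sweep(X, 2, MU)
  r2 <- mahalanobis(X, MU, cov(X))  # squared Mahalanobis distances
  crossprod(Xc * sqrt(r2)) / (nrow(X) * (ncol(X) + 2))
}

MU <- colMeans(X)

# Whiten with the first scatter, then diagonalise the second scatter
e1 <- eigen(cov(X), symmetric = TRUE)
S1_inv_sqrt <- e1$vectors %*% diag(1 / sqrt(e1$values)) %*% t(e1$vectors)
e2 <- eigen(S1_inv_sqrt %*% scatter4(X) %*% S1_inv_sqrt, symmetric = TRUE)

d <- e2$values                      # eigenvalues, in descending order
W <- t(e2$vectors) %*% S1_inv_sqrt  # unmixing matrix
S <- sweep(X, 2, MU) %*% t(W)       # rows are s_i = W (x_i - MU)

# Split the sources into the presumed non-Gaussian and Gaussian parts
S_1 <- S[, seq_len(k), drop = FALSE]
S_2 <- S[, k + seq_len(p - k), drop = FALSE]
```

By construction the recovered sources have identity covariance with respect to the first scatter, which can be checked with round(cov(S), 8).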
Two possible bootstrap tests are provided for testing that the last p-k components are Gaussian and independent from the first k components:
- s.boot="B1": The first strategy has the following steps:
  1. Take a bootstrap sample S_1^* of size n from S_1.
  2. Take a bootstrap sample S_2^* consisting of a matrix with Gaussian random variables having cov(S_2).
  3. Combine S^*=(S_1^*, S_2^*) and create X^*= S^* W.
  4. Compute the test statistic based on X^*.
  5. Repeat the previous steps n.boot times.
  Note that this bootstrapping test does not use the assumption of "independent components"; it only uses that the last p-k components are Gaussian and independent from the first k components. Therefore this strategy can be applied in an independent component analysis (ICA) framework and in a non-Gaussian component analysis (NGCA) framework.
- s.boot="B2": The second strategy has the following steps:
  1. Take a bootstrap sample S_1^* of size n from S_1 where the subsampling is done separately for each independent component.
  2. Take a bootstrap sample S_2^* consisting of a matrix with Gaussian random variables having cov(S_2).
  3. Combine S^*=(S_1^*, S_2^*) and create X^*= S^* W.
  4. Compute the test statistic based on X^*.
  5. Repeat the previous steps n.boot times.
  This bootstrapping strategy assumes a full ICA model and cannot be used in an NGCA framework. Note that when the goal is to recover the non-Gaussian independent components, both scatters used must have the independence property.
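The difference between the two strategies lies in how S_1^* is drawn. The following minimal base R sketch generates one bootstrap source matrix S^* under either strategy; the helper name boot_sources is hypothetical, the input is assumed to be an already centered and ordered source matrix, and the sketch stops before forming X^* and the test statistic. It is not the package's implementation.

```r
# Hypothetical helper (not part of ICtest): draw one bootstrap source
# matrix S^* = (S_1^*, S_2^*) from a centered n x p source matrix S.
boot_sources <- function(S, k, strategy = c("B1", "B2")) {
  strategy <- match.arg(strategy)
  n <- nrow(S)
  p <- ncol(S)
  stopifnot(k >= 1, k < p)
  S1 <- S[, seq_len(k), drop = FALSE]
  S2 <- S[, k + seq_len(p - k), drop = FALSE]

  if (strategy == "B1") {
    # Resample whole rows: dependence among the first k components is kept
    S1_star <- S1[sample.int(n, n, replace = TRUE), , drop = FALSE]
  } else {
    # Resample each column separately: imposes mutual independence
    S1_star <- matrix(
      vapply(seq_len(k),
             function(j) sample(S1[, j], n, replace = TRUE),
             numeric(n)),
      nrow = n
    )
  }

  # Zero-mean Gaussian sample whose population covariance is cov(S2)
  Z <- matrix(rnorm(n * (p - k)), nrow = n)
  S2_star <- Z %*% chol(cov(S2))

  cbind(S1_star, S2_star)
}

set.seed(123)
S_demo <- cbind(rexp(200) - 1, runif(200) - 0.5, rnorm(200), rnorm(200))
star_B1 <- boot_sources(S_demo, k = 2, strategy = "B1")
star_B2 <- boot_sources(S_demo, k = 2, strategy = "B2")
dim(star_B1)
```

Row-wise resampling is what lets "B1" work without the full ICA model, while column-wise resampling in "B2" only makes sense when the first k components are assumed mutually independent.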
Value
A list of class ictest inheriting from class htest containing:
statistic |
the value of the test statistic. |
p.value |
the p-value of the test. |
parameter |
the number of bootstrap samples used to obtain the p-value. |
method |
character string denoting which test was performed and which scatters were used. |
data.name |
character string giving the name of the data. |
alternative |
character string specifying the alternative hypothesis. |
k |
the number of non-Gaussian components used in the testing problem. |
W |
the transformation matrix to the independent components. Also known as unmixing matrix. |
S |
data matrix with the centered independent components. |
D |
the underlying eigenvalues. |
MU |
the location of the data which was subtracted before calculating the independent components. |
s.boot |
character string denoting which bootstrapping strategy was used. |
Author(s)
Klaus Nordhausen
References
Nordhausen, K., Oja, H. and Tyler, D.E. (2022), Asymptotic and Bootstrap Tests for Subspace Dimension, Journal of Multivariate Analysis, 188, 104830. <doi:10.1016/j.jmva.2021.104830>.
Nordhausen, K., Oja, H., Tyler, D.E. and Virta, J. (2017), Asymptotic and Bootstrap Tests for the Dimension of the Non-Gaussian Subspace, Signal Processing Letters, 24, 887–891. <doi:10.1109/LSP.2017.2696880>.
Radojicic, U. and Nordhausen, K. (2020), Non-Gaussian Component Analysis: Testing the Dimension of the Signal Subspace. In Maciak, M., Pestas, M. and Schindler, M. (editors) "Analytical Methods in Statistics. AMISTAT 2019", 101–123, Springer, Cham. <doi:10.1007/978-3-030-48814-7_6>.
See Also
Examples
library(ICtest)

# Three non-Gaussian and three Gaussian sources, mixed by a random matrix
n <- 750
S <- cbind(runif(n), rchisq(n, 2), rexp(n), rnorm(n), rnorm(n), rnorm(n))
A <- matrix(rnorm(36), ncol = 6)
X <- S %*% t(A)

# n.boot is small for demonstration purposes; it should be larger in practice
ICSboot(X, k = 1, n.boot = 20)

# Other scatter functions can be used as long as they return only a matrix
if (require("ICSNP")) {
  myTyl <- function(X, ...) HR.Mest(X, ...)$scatter
  myT <- function(X, ...) tM(X, ...)$V
  # n.boot is small for demonstration purposes; it should be larger in practice
  ICSboot(X, k = 3, S1 = myT, S2 = myTyl, s.boot = "B2", n.boot = 20)
}