NGPPsim {ICtest} | R Documentation |
Signal Subspace Dimension Testing Using non-Gaussian Projection Pursuit
Description
Tests whether the true dimension of the signal subspace is less than or equal to a given k. The test statistic is a multivariate extension of the classical Jarque-Bera statistic, and its distribution under the null hypothesis is obtained by simulation.
Usage
NGPPsim(X, k, nl = c("skew", "pow3"), alpha = 0.8, N = 1000, eps = 1e-6,
verbose = FALSE, maxiter = 100)
Arguments
X |
Numeric matrix with n rows corresponding to the observations and p columns corresponding to the variables. |
k |
Number of components to estimate, |
nl |
Vector of non-linearities. A convex combination of the corresponding squared objective functions is then used as the projection index. The choices include |
alpha |
Vector of positive weights between 0 and 1 given to the non-linearities. The length of |
N |
Number of normal samples to be used in simulating the distribution of the test statistic under the null hypothesis. |
eps |
Convergence tolerance. |
verbose |
If |
maxiter |
Maximum number of iterations. |
Details
It is assumed that the data are a random sample from the model x = m + A s, where the latent vector s = (s_1^T, s_2^T)^T consists of a k-dimensional non-Gaussian subvector (the signal) and a (p - k)-dimensional Gaussian subvector (the noise), and the components of s are mutually independent. Without loss of generality we further assume that the components of s have zero means and unit variances.
To test the null hypothesis H_0: k_{true} \leq k, the algorithm first estimates k + 1 components using deflation-based NGPP with the chosen non-linearities and weighting. Under the null hypothesis the distribution of the final p - k components is standard multivariate normal, and the significance of the test is obtained by comparing the objective function value of the (k + 1)th estimated component to the same quantity estimated from N samples of size n from the (p - k)-dimensional standard multivariate normal distribution.
Note that if maxiter is reached at any step of the algorithm, it will use the current estimated direction and continue to the next step.
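The simulation step described above can be illustrated with a minimal base R sketch. It is not the implementation used by NGPPsim: the helper names toy_index and sim_p_value are hypothetical, the index is an assumed Jarque-Bera-type form (weight alpha on squared skewness, 1 - alpha on squared excess kurtosis), and the projection pursuit optimization over directions is skipped entirely, so the statistic is evaluated on a single fixed univariate sample. The add-one correction in the p-value is likewise a choice made for this sketch.

```r
# Hypothetical toy index (an assumption, not the ICtest internals):
# weighted sum of squared skewness and squared excess kurtosis
# of a standardized univariate sample.
toy_index <- function(u, alpha = 0.8) {
  u <- (u - mean(u)) / sd(u)
  alpha * mean(u^3)^2 + (1 - alpha) * (mean(u^4) - 3)^2
}

# Monte Carlo p-value: compare the observed statistic with the same
# statistic computed from N standard normal samples of size n.
sim_p_value <- function(stat_obs, n, N = 1000, alpha = 0.8) {
  null_stats <- replicate(N, toy_index(rnorm(n), alpha))
  # Add-one correction keeps the estimate strictly positive
  (sum(null_stats >= stat_obs) + 1) / (N + 1)
}

set.seed(1)
n <- 200
# A clearly non-Gaussian sample versus a Gaussian one
p_signal <- sim_p_value(toy_index(rexp(n)), n, N = 200)
p_noise  <- sim_p_value(toy_index(rnorm(n)), n, N = 200)
```

In this sketch the exponential sample should yield a small p-value, while the p-value for the Gaussian sample is not expected to be systematically small. Increasing N makes the simulated null distribution, and hence the p-value, less noisy at the cost of computation time.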
Value
A list with class 'ictest', inheriting from the class 'htest', containing the following components:
statistic |
Test statistic, i.e. the objective function value of the ( |
p.value |
Obtained |
parameter |
Number |
method |
Character string denoting which test was performed. |
data.name |
Character string giving the name of the data. |
alternative |
Alternative hypothesis, i.e. |
k |
Tested dimension |
W |
Estimated unmixing matrix |
S |
Matrix of size |
D |
Vector of the objective function values of the signals |
MU |
Location vector of the data that was subtracted before estimating the signal components. |
Author(s)
Joni Virta
References
Virta, J., Nordhausen, K. and Oja, H., (2016), Projection Pursuit for non-Gaussian Independent Components, <https://arxiv.org/abs/1612.05445>.
See Also
Examples
library(ICtest)
# Simulated data with 2 signals and 2 noise components
n <- 200
S <- cbind(rexp(n), rbeta(n, 1, 2), rnorm(n), rnorm(n))
A <- matrix(rnorm(16), ncol = 4)
X <- S %*% t(A)
# The number of simulations N should be increased in practical situations
# Now we settle for N = 100
res1 <- NGPPsim(X, 1, N = 100)
res1
screeplot(res1)
res2 <- NGPPsim(X, 2, N = 100)
res2
screeplot(res2)