R: Order Determination for Tensorial PCA Using Augmentation

tPCAaug {tensorBSS}

R Documentation

Order Determination for Tensorial PCA Using Augmentation

Description

In a tensorial PCA context the dimensions of a core tensor are estimated based on augmentation of additional noise components. Information from both eigenvectors and eigenvalues are then used to obtain the dimension estimates.

Usage

tPCAaug(x, noise = "median", naug = 1, nrep = 1, 
  sigma2 = NULL, alpha = NULL)

Arguments

`x`	array of an order at least three with the last dimension corresponding to the sampling units.
`noise`	specifies how to estimate the noise variance. Can be one of `"median"`, `"quantile"`, `"last"`, `"known"`. Default is `"median"`. See details for further information.
`naug`	number of augmented variables in each mode. Default is 1.
`nrep`	number of repetitions for the augmentation. Default is 1.
`sigma2`	if `noise = "known"` the value of the noise variance.
`alpha`	if `noise = "quantile"` this specifies the quantile to be used.

Details

For simplicity details are given for matrix valued observations.

Assume having a sample of p_1 \times p_2 matrix-valued observations which are realizations of the model X = U_L Z U_R'+ N, where U_L and U_R are matrices with orthonormal columns, Z is the random, zero mean k_1 \times k_2 core matrix with k_1 \leq p_1 and k_2 \leq p_2. N is p_1 \times p_2 matrix-variate noise that follows a matrix variate spherical distribution with E(N) = 0 and E(N N') = \sigma^2 I_{p_1} and is independent from Z. The goal is to estimate k_1 and k_2. For that purpose the eigenvalues and eigenvectors of the left and right covariances are used. To evaluate the variation in the eigenvectors, in each mode the matrix X is augmented with naug normally distributed components appropriately scaled by noise standard deviation. The procedure can be repeated nrep times to reduce random variation in the estimates.

The procedure needs an estimate of the noise variance and four options are available via the argument noise:

noise = "median": Assumes that at least half of components are noise and uses thus the median of the pooled and scaled eigenvalues as an estimate.
noise = "quantile": Assumes that at least 100 alpha % of the components are noise and uses the mean of the lower alpha quantile of the pooled and scaled eigenvalues from all modes as an estimate.
noise = "last": Uses the pooled information from all modes and then the smallest eigenvalue as estimate.
noise = "known": Assumes the error variance is known and needs to be provided via sigma2.

Value

A list of class 'taug' inheriting from class 'tladle' and containing:

`U`	list containing the modewise rotation matrices.
`D`	list containing the modewise eigenvalues.
`S`	array of the same size as `x` containing the principal components.
`ResMode`	a list with the modewise results which are lists containing: mode label for the mode. k the order estimated for that mode. fn vector giving the measures of variation of the eigenvectors. phin normalized eigenvalues. lambda the unnormalized eigenvalues used to compute phin. gn the main criterion augmented order estimator. comp vector from 0 to the number of dimensions to be evaluated.
`xmu`	the data location
`data.name`	string with the name of the input data
`method`	string `tPCA`.
`Sigma2`	estimate of standardized sigma2 from the model described above or the standardized provided value. Sigma2 is the estimate for the variance of individual entries of `N`.
`AllSigHat2`	vector of noise variances used for each mode.

Author(s)

Klaus Nordhausen, Una Radojicic

References

Radojicic, U., Lietzen, N., Nordhausen, K. and Virta, J. (2021), On order determinaton in 2D PCA. Manuscript.

Examples

library(ICtest)


# matrix-variate example
n <- 200
sig <- 0.6

Z <- rbind(sqrt(0.7)*rt(n,df=5)*sqrt(3/5),
           sqrt(0.3)*runif(n,-sqrt(3),sqrt(3)),
           sqrt(0.3)*(rchisq(n,df=3)-3)/sqrt(6),
           sqrt(0.9)*(rexp(n)-1),
           sqrt(0.1)*rlogis(n,0,sqrt(3)/pi),
           sqrt(0.5)*(rbeta(n,2,2)-0.5)*sqrt(20)
)

dim(Z) <- c(3, 2, n)

U1 <- rorth(12)[,1:3]
U2 <- rorth(8)[,1:2]
U <- list(U1=U1, U2=U2)
Y <- tensorTransform2(Z,U,1:2)
EPS <- array(rnorm(12*8*n, mean=0, sd=sig), dim=c(12,8,n))
X <- Y + EPS


TEST <- tPCAaug(X)
# Dimension should be 3 and 2 and (close to) sigma2 0.36
TEST

# Noise variance in i-th mode is equal to Sigma2 multiplied by the product 
# of number of colums of all modes except i-th one

TEST$Sigma2*c(8,12)
# This is returned as
TEST$AllSigHat2

# higher order tensor example

Z2 <- rnorm(n*3*2*4*10)

dim(Z2) <- c(3,2,4,10,n)

U2.1 <- rorth(12)[ ,1:3]
U2.2 <- rorth(8)[ ,1:2]
U2.3 <- rorth(5)[ ,1:4]
U2.4 <- rorth(20)[ ,1:10]

U2 <- list(U1 = U2.1, U2 = U2.2, U3 = U2.3, U4 = U2.4)
Y2 <- tensorTransform2(Z2, U2, 1:4)
EPS2 <- array(rnorm(12*8*5*20*n, mean=0, sd=sig), dim=c(12, 8, 5, 20, n))
X2 <- Y2 + EPS2


TEST2 <- tPCAaug(X2)
TEST2

[Package tensorBSS version 0.3.8 Index]