R: Ladle estimate for an arbitrary matrix

ladle {ICtest}

R Documentation

Ladle estimate for an arbitrary matrix

Description

The ladle estimates the rank of a symmetric matrix S by combining the classical screeplot with an estimate of the rank from the bootstrap eigenvector variability of S.

Usage

ladle(x, S, n.boots = 200, ...)

Arguments

`x`	`n` x `p` data matrix.
`S`	Function for computing a `q` x `q` symmetric matrix from the data `x`.
`n.boots`	The number of bootstrap samples.
`...`	Further parameters passed to `S`.

Details

Assume that the eigenvalues of the population version of S are \lambda_1 >= ... >= \lambda_k > \lambda_{k+1} = ... = \lambda_q. The ladle estimates the true value of k (for example, the rank of S) by combining the classical screeplot with an estimate of k from the bootstrap eigenvector variability of S.

For applying the ladle to either PCA, FOBI or SIR, see the dedicated functions PCAladle, FOBIladle, SIRladle.
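The combination described above can be illustrated with a minimal base-R sketch. It is not the code used by ICtest: the helper name `ladle_sketch`, the choice of candidate ranks 0, ..., q - 1, and the normalization (dividing by one plus the sum of the raw values, modeled on Luo and Li (2016)) are all assumptions made for illustration. For each candidate rank, the sketch measures how much the span of the leading eigenvectors moves under bootstrap resampling and adds that to the normalized next eigenvalue.

```r
# Hypothetical helper, NOT part of ICtest: a simplified ladle criterion.
# Assumption: candidate ranks are k = 0, ..., q - 1 and both parts are
# normalized by (1 + their sum), in the spirit of Luo and Li (2016).
ladle_sketch <- function(x, S, n.boots = 200, ...) {
  x <- as.matrix(x)
  n <- nrow(x)
  eig <- eigen(S(x, ...), symmetric = TRUE)
  q <- length(eig$values)

  # f0[k + 1] stores the eigenvector variability for rank k; rank 0 has none
  f0 <- numeric(q)
  for (b in seq_len(n.boots)) {
    # Resample the rows of x with replacement and recompute the eigenvectors
    xb <- x[sample.int(n, n, replace = TRUE), , drop = FALSE]
    vb <- eigen(S(xb, ...), symmetric = TRUE)$vectors
    for (k in seq_len(q - 1)) {
      # 1 - |det(B_k' B_k*)| is 0 when the two k-dimensional spans coincide
      d <- det(crossprod(eig$vectors[, 1:k, drop = FALSE],
                         vb[, 1:k, drop = FALSE]))
      f0[k + 1] <- f0[k + 1] + (1 - abs(d)) / n.boots
    }
  }

  fn <- f0 / (1 + sum(f0))                     # normalized variability
  phin <- eig$values / (1 + sum(eig$values))   # phin[k + 1] uses lambda_{k+1}
  gn <- fn + phin

  # Index 1 corresponds to k = 0, hence the shift by one
  list(k = which.min(gn) - 1, fn = fn, phin = phin, gn = gn,
       lambda = eig$values)
}

# Toy usage: covariance matrix of data with two dominant directions
x_toy <- matrix(rnorm(200 * 6), 200, 6) %*% diag(c(4, 3, 1, 1, 1, 1))
res <- ladle_sketch(x_toy, cov, n.boots = 50)
res$k
```

For small k the eigenvalue part is large, while for k beyond the true value the leading eigenvectors are no longer well separated and the bootstrap part grows, so the sum tends to be smallest near the true k.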

Value

A list of class ladle containing:

`method`	The string “general”.
`k`	The estimated value of k.
`fn`	A vector giving the bootstrap-based measures of variation of the eigenvectors for the different numbers of components.
`phin`	The normalized eigenvalues of the S matrix.
`gn`	The main criterion for the ladle estimate: the sum of fn and phin. The estimate k is the value at which gn attains its minimum.
`lambda`	The eigenvalues of the S matrix.
`data.name`	The name of the data for which the ladle estimate was computed.

Author(s)

Joni Virta

References

Luo, W. and Li, B. (2016), Combining Eigenvalues and Variation of Eigenvectors for Order Determination, Biometrika, 103, 875-887. <doi:10.1093/biomet/asw051>

Examples

# Function for computing the left CCA matrix
S_CCA <- function(x, dim){
  x1 <- x[, 1:dim]
  x2 <- x[, -(1:dim)]
  stand <- function(x){
    x <- as.matrix(x)
    x <- sweep(x, 2, colMeans(x), "-")
    eigcov <- eigen(cov(x), symmetric = TRUE)
    x%*%(eigcov$vectors%*%diag((eigcov$values)^(-1/2))%*%t(eigcov$vectors))
  }
  
  x1stand <- stand(x1)
  x2stand <- stand(x2)
  
  crosscov <- cov(x1stand, x2stand)
  
  tcrossprod(crosscov)
}

# Toy data with two canonical components
n <- 200
x1 <- matrix(rnorm(n*5), n, 5)
x2 <- cbind(x1[, 1] + rnorm(n, sd = sqrt(0.5)),
            -1*x1[, 1] + x1[, 2] + rnorm(n, sd = sqrt(0.5)),
            matrix(rnorm(n*3), n, 3))
x <- cbind(x1, x2)

# The ladle estimate
ladle_1 <- ladle(x, S_CCA, dim = 5)
ladleplot(ladle_1)

[Package ICtest version 0.3-5 Index]