batchpca {onlinePCA} | R Documentation |
Batch PCA
Description
This function performs the PCA of a data matrix or covariance matrix, returning the specified number of principal components (eigenvectors) and eigenvalues.
Usage
batchpca(x, q, center, type = c("data","covariance"), byrow = FALSE)
Arguments
x | data or covariance matrix |
q | number of requested PCs |
center | optional centering vector for x |
type | type of the matrix x |
byrow | Are observation vectors stored in rows (TRUE) or in columns (FALSE)? |
Details
The PCA is efficiently computed using the functions svds or eigs_sym of package RSpectra, depending on the argument type. An Implicitly Restarted Arnoldi Method (IRAM) is used in the former case and an Implicitly Restarted Lanczos Method (IRLM) in the latter.
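The dispatch described above can be illustrated with a minimal sketch. It is not the source of batchpca: the helper name sketch_pca is hypothetical, and it assumes that observations are stored in rows and that the data have already been centered. It only shows how RSpectra::svds and RSpectra::eigs_sym could each yield the leading q eigenvalues and eigenvectors.

```r
# Hypothetical helper (not part of onlinePCA): choose the RSpectra routine
# according to the type of input matrix.
sketch_pca <- function(x, q, type = c("data", "covariance")) {
  type <- match.arg(type)
  if (type == "data") {
    # Assumption: rows are observations and columns are already centered.
    s <- RSpectra::svds(x / sqrt(nrow(x)), k = q, nu = 0, nv = q)
    list(values = s$d^2, vectors = s$v)
  } else {
    # Symmetric input: partial eigendecomposition, largest eigenvalues first.
    e <- RSpectra::eigs_sym(x, k = q)
    list(values = e$values, vectors = e$vectors)
  }
}

# Small usage example, run only if RSpectra is installed.
if (requireNamespace("RSpectra", quietly = TRUE)) {
  set.seed(1)
  n <- 300; d <- 20; q <- 3
  xs <- matrix(rnorm(n * d), n, d) %*% diag(sqrt(d:1))
  xc <- scale(xs, center = TRUE, scale = FALSE)
  res_data <- sketch_pca(xc, q, type = "data")
  res_cov  <- sketch_pca(cov(xs), q, type = "covariance")
}
```

The two results are not expected to match exactly: the data path divides by n while cov divides by n - 1, so the eigenvalues differ by the factor (n - 1)/n.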
The arguments center and byrow are only in effect if type is "data". In this case a scaling factor 1/\sqrt{n} (not 1/\sqrt{n-1}) is applied to x before computing its singular values and vectors, where n is the number of observation vectors stored in x.
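The effect of the 1/\sqrt{n} scaling can be checked with base R alone. The sketch below uses svd and eigen rather than batchpca or RSpectra, on small simulated data (observations in rows; all variable names are illustrative). It compares the squared singular values of the centered, scaled data with the eigenvalues of cov(x), which uses the divisor n - 1.

```r
set.seed(1)
n <- 200; d <- 5; q <- 2
x <- matrix(rnorm(n * d), n, d) %*% diag(sqrt(d:1))

# Center the columns, then scale by 1/sqrt(n) before the SVD.
xc <- scale(x, center = TRUE, scale = FALSE)
sv <- svd(xc / sqrt(n), nu = 0, nv = q)
values_data <- sv$d[1:q]^2

# Eigendecomposition of the sample covariance (divisor n - 1).
ev <- eigen(cov(x), symmetric = TRUE)
values_cov <- ev$values[1:q]

# The two sets of eigenvalues differ by the factor (n - 1) / n.
print(all.equal(values_data, values_cov * (n - 1) / n))

# The leading eigenvectors agree up to sign.
print(round(abs(crossprod(sv$v, ev$vectors[, 1:q])), 6))
```

For large n the factor (n - 1)/n is close to 1, so the difference matters mainly for small samples.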
Value
A list with components
values | the first q eigenvalues |
vectors | the first q principal components (eigenvectors) |
Examples
## Not run:
## Simulate data
n <- 1e4
d <- 500
q <- 10
x <- matrix(runif(n*d), n, d)
x <- x %*% diag(sqrt(12*(1:d)))
# The eigenvalues of cov(x) are approximately 1, 2, ..., d
# and the corresponding eigenvectors are approximately
# the canonical basis of R^d
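## PCA computation (from fastest to slowest)
# batchpca on the centered data (observations in rows)
system.time(pca1 <- batchpca(scale(x, scale = FALSE), q, byrow = TRUE))
# batchpca on the sample covariance matrix
system.time(pca2 <- batchpca(cov(x), q, type = "covariance"))
# full eigendecomposition of the covariance matrix (all d components)
system.time(pca3 <- eigen(cov(x), symmetric = TRUE))
# SVD of the centered, scaled data: no left vectors, q right vectors
system.time(pca4 <- svd(scale(x / sqrt(n - 1), scale = FALSE), nu = 0, nv = q))
# full PCA with prcomp
system.time(pca5 <- prcomp(x))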
## End(Not run)