secularRpca {onlinePCA} | R Documentation |
Recursive PCA Using Secular Equations
Description
The PCA is recursively updated after observation of a new vector (rank one modification of the covariance matrix). Eigenvalues are computed as roots of a secular equation. Eigenvectors (principal components) are deduced by explicit calculation (Bunch et al., 1978) or approximated with the method of Gu and Eisenstat (1994).
Usage
secularRpca(lambda, U, x, n, f = 1/n, center, tol = 1e-10, reortho = FALSE)
Arguments
lambda |
vector of eigenvalues. |
U |
matrix of eigenvectors (PCs) stored in columns. |
x |
new data vector. |
n |
sample size before observing x. |
f |
forgetting factor: a number in (0,1). |
center |
centering vector for x. |
tol |
tolerance for the computation of eigenvalues. |
reortho |
if FALSE, eigenvectors are explicitly computed using the method of Bunch et al. (1978). If TRUE, they are approximated with the method of Gu and Eisenstat (1994). |
Details
The method of secular equations provides accurate eigenvalues in all but pathological cases. On the other hand, the perturbation method implemented by perturbationRpca typically runs much faster but is only accurate for a large sample size n.
The default eigendecomposition method is that of Bunch et al. (1978). This algorithm consists of three stages: initial deflation, nonlinear solution of secular equations, and calculation of eigenvectors.
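The second and third stages can be sketched in base R for the simplest case, a diagonal matrix diag(d) modified by rho * z z'. This is an illustrative sketch, not the package's internal code: the values of d, z and rho are made up, the entries of d are assumed distinct and those of z nonzero (so that no deflation is needed), and uniroot() stands in for a dedicated secular equation solver.

```r
# Rank-one modification of a diagonal matrix: diag(d) + rho * z z'
d   <- c(5, 3, 2, 1)            # current eigenvalues, distinct and decreasing
z   <- c(0.5, -0.4, 0.3, 0.6)   # update vector in the eigenbasis, all nonzero
rho <- 0.7                      # positive weight of the update

# Secular function: its roots are the eigenvalues of diag(d) + rho * z z'
secular <- function(lam) 1 + rho * sum(z^2 / (d - lam))

# For rho > 0 the roots interlace with d: one root in each interval
# (d[i], d[i-1]) and the largest one in (d[1], d[1] + rho * sum(z^2))
lower <- d
upper <- c(d[1] + rho * sum(z^2), d[-length(d)])
eps   <- 1e-9                   # stay clear of the poles at d[i]
roots <- mapply(function(lo, hi) {
  uniroot(secular, c(lo + eps, hi - eps), tol = 1e-12)$root
}, lower, upper)

# Explicit eigenvectors: column i is proportional to z / (d - roots[i])
vecs <- sapply(roots, function(lam) {
  v <- z / (d - lam)
  v / sqrt(sum(v^2))
})

# Reference decomposition for comparison
ref <- eigen(diag(d) + rho * tcrossprod(z), symmetric = TRUE)
max(abs(roots - ref$values))    # expected to be small
```

Under these assumptions, roots should agree with ref$values up to the solver tolerance, and the columns of vecs should match ref$vectors up to sign.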
The calculation of eigenvectors (PCs) is accurate for the first few eigenvectors, but loss of accuracy and orthogonality may occur for subsequent ones. In contrast, the method of Gu and Eisenstat (1994) is robust against small errors in the computation of eigenvalues. It provides eigenvectors that may be less accurate than those of the default method but for which strict orthogonality is guaranteed.
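The difference between the two approaches can be illustrated in base R. The sketch below is written in the spirit of Gu and Eisenstat (1994) and is not the package's implementation: the values of d, z and rho are made up, the eigenvalue errors are simulated by a small artificial perturbation, and the helper names eigvec and orth_err are hypothetical. The idea is to replace z by a vector z_hat for which the computed eigenvalues are exact, and to build the eigenvectors from z_hat.

```r
d   <- c(5, 3, 2, 1)            # distinct eigenvalues before the update
z   <- c(0.5, -0.4, 0.3, 0.6)   # nonzero update vector
rho <- 0.7

# "Computed" eigenvalues: exact ones plus a small simulated error
lam     <- eigen(diag(d) + rho * tcrossprod(z), symmetric = TRUE)$values
lam_hat <- lam * (1 + 1e-6 * c(1, -1, 1, -1))

# Eigenvectors from the explicit formula: columns proportional to zz / (d - l)
eigvec <- function(zz, l) {
  sapply(l, function(s) {
    v <- zz / (d - s)
    v / sqrt(sum(v^2))
  })
}

# Direct use of the original z with the inexact eigenvalues
V_direct <- eigvec(z, lam_hat)

# Modified vector z_hat such that lam_hat are the exact eigenvalues of
# diag(d) + rho * z_hat z_hat'; the signs are taken from z
z_hat <- vapply(seq_along(d), function(i) {
  num <- prod(lam_hat - d[i])
  den <- rho * prod(d[-i] - d[i])
  sign(z[i]) * sqrt(num / den)
}, numeric(1))
V_ge <- eigvec(z_hat, lam_hat)

# Departure from orthogonality: largest entry of |V'V - I|
orth_err <- function(V) max(abs(crossprod(V) - diag(ncol(V))))
c(direct = orth_err(V_direct), modified = orth_err(V_ge))
```

In this toy setting, the eigenvectors built from z_hat should be orthogonal up to rounding error, whereas those built from the original z inherit an orthogonality error of the same order as the simulated eigenvalue error.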
The forgetting factor f can be interpreted as the inverse of the number of observation vectors effectively used in the PCA: the "memory" of the PCA algorithm goes back 1/f observations in the past. For larger values of f, the PCA update gives more relative weight to the new data x and less to the current PCA (lambda, U). For nonstationary processes, f should be closer to 1.
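The "memory" interpretation can be checked numerically. The sketch below assumes an exponentially weighted update in which each step multiplies the existing weights by 1 - f and gives weight f to the new observation; this form is an assumption made for illustration and is not necessarily the exact weighting used by the package. The value f = 0.05 is arbitrary.

```r
f <- 0.05                 # hypothetical forgetting factor
k <- 0:2000               # age of an observation, in number of updates
w <- f * (1 - f)^k        # weight of an observation of age k under the assumed scheme

sum(w)                    # total weight, numerically 1
sum(k * w)                # mean age of the weighted data: (1 - f) / f, about 19
1 / f                     # nominal memory length: 20 observations
```

With f = 1/n held fixed, the same computation gives a memory of about n observations; larger values of f shorten it.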
Only one of the arguments n and f needs to be specified. If it is n, then f is set to 1/n by default (usual PCA of the sample covariance matrix, where all data points have equal weight). If f is specified, its value overrides any specification of n.
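The two calling conventions might look as follows. This is a hedged sketch based only on the Usage signature above: the data are simulated, f = 0.05 is an arbitrary choice, and the exponentially weighted mean xbar_f is an assumption about how one would center the data when f is supplied directly. The calls are guarded so that the code does nothing if onlinePCA is not installed.

```r
set.seed(42)
n <- 100
d <- 5
x <- matrix(runif(n * d), n, d)
xbar <- colMeans(x)
pca0 <- eigen(cov(x))
newx <- runif(d)
f <- 0.05   # hypothetical value: memory of roughly 1/f = 20 observations

if (requireNamespace("onlinePCA", quietly = TRUE)) {
  # Only n given: f defaults to 1/n, all observations weighted equally
  xbar_n <- onlinePCA::updateMean(xbar, newx, n)
  pca_n  <- onlinePCA::secularRpca(pca0$values, pca0$vectors, newx, n,
                                   center = xbar_n)

  # Only f given: n is omitted, as the text above allows
  xbar_f <- (1 - f) * xbar + f * newx   # assumed exponentially weighted mean
  pca_f  <- onlinePCA::secularRpca(pca0$values, pca0$vectors, newx, f = f,
                                   center = xbar_f)
}
```

Both results are lists with components values and vectors, as described under Value.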
Value
A list with components
values |
updated eigenvalues in decreasing order. |
vectors |
updated eigenvectors (PCs). |
References
Bunch, J.R., Nielsen, C.P., and Sorensen, D.C. (1978). Rank-one modification of the symmetric eigenproblem. Numerische Mathematik.
Gu, M. and Eisenstat, S.C. (1994). A stable and efficient algorithm for the rank-one modification of the symmetric eigenproblem. SIAM Journal on Matrix Analysis and Applications.
See Also
Examples
library(onlinePCA)  # provides updateMean() and secularRpca()
# Initial data set
n <- 100
d <- 50
x <- matrix(runif(n*d),n,d)
xbar <- colMeans(x)
pca0 <- eigen(cov(x))
# New observation
newx <- runif(d)
# Recursive PCA with secular equations
xbar <- updateMean(xbar, newx, n)
pca <- secularRpca(pca0$values, pca0$vectors, newx, n, center = xbar)