R: Incremental PCA

incRpca {onlinePCA}

R Documentation

Incremental PCA

Description

Online PCA using the incremental SVD method of Brand (2002) and Arora et al. (2012).

Usage

incRpca(lambda, U, x, n, f = 1/n, q = length(lambda), center, tol = 1e-7)

Arguments

`lambda`	vector of eigenvalues.
`U`	matrix of eigenvectors (principal components) stored in columns.
`x`	new data vector.
`n`	sample size before observing `x`.
`f`	forgetting factor: a number in (0,1).
`q`	number of eigenvectors to compute.
`center`	optional centering vector for `x`.
`tol`	numerical tolerance used to decide whether `x` lies in the span of `U`.

Details

If the Euclidean distance between x and the subspace spanned by the columns of U exceeds tol, the number of eigenpairs first increases to length(lambda)+1 and is then truncated to q if necessary. Otherwise, the eigenvectors remain unchanged and only the eigenvalues are updated.
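The distance test and the temporary increase to length(lambda)+1 eigenpairs can be sketched in base R. This is a minimal illustration, not the package's source code: it assumes that U has orthonormal columns and that the update targets the matrix (1-f) U diag(lambda) U' + f x x', a common form for incremental SVD updates of this kind. All variable names (`coef`, `resid`, `dist`, `U_ext`, `small`) and the toy data are hypothetical.

```r
set.seed(42)
d <- 6; k <- 2; f <- 0.1; tol <- 1e-7

# Toy current PCA: orthonormal U (d x k) and eigenvalues lambda (assumed values)
U <- qr.Q(qr(matrix(rnorm(d * k), d, k)))
lambda <- c(3, 1)
x <- rnorm(d)

# Reference: direct eigendecomposition of the d x d updated matrix
C_new <- (1 - f) * U %*% diag(lambda) %*% t(U) + f * tcrossprod(x)
ref <- eigen(C_new, symmetric = TRUE)$values[1:(k + 1)]

# Incremental route: project x on U and measure what is left over
coef <- drop(crossprod(U, x))       # coordinates of x in span(U)
resid <- x - drop(U %*% coef)       # part of x orthogonal to span(U)
dist <- sqrt(sum(resid^2))          # Euclidean distance from x to span(U)
stopifnot(dist > tol)               # this sketch covers the "new direction" case only

# Enlarge the basis by one direction and solve a small (k+1) x (k+1) problem
U_ext <- cbind(U, resid / dist)
z <- sqrt(f) * c(coef, dist)
small <- diag(c((1 - f) * lambda, 0)) + tcrossprod(z)
eig <- eigen(small, symmetric = TRUE)

values <- eig$values                # k + 1 updated eigenvalues, decreasing
vectors <- U_ext %*% eig$vectors    # corresponding eigenvectors in R^d

# Truncation at order q would keep values[1:q] and vectors[, 1:q]
max(abs(values - ref))              # agreement with the direct computation
```

The point of the design is that the eigenproblem solved at each step has size k+1 rather than d, which is what makes the update cheap when d is large.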
The forgetting factor f can be interpreted as the inverse of the number of observation vectors effectively used in the PCA: the "memory" of the PCA algorithm goes back about 1/f observations. Larger values of f give more relative weight to the new observation x and less to the current PCA (lambda, U). For nonstationary processes, a larger f (closer to 1) is therefore appropriate.
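The "memory of about 1/f observations" reading can be checked numerically. The sketch below assumes that each update weights the current PCA by (1-f) and the new observation by f with a constant f, so that an observation seen k steps ago carries weight f(1-f)^k. This weighting scheme is an assumption made for illustration, and the value of f is arbitrary.

```r
f <- 0.05                 # forgetting factor (illustrative value)
k <- 0:2000               # lag: 0 is the most recent observation
w <- f * (1 - f)^k        # weight of the observation seen k steps ago

total_weight <- sum(w)                  # close to 1
mean_lag <- sum(k * w) / total_weight   # close to (1 - f) / f, i.e. roughly 1/f

c(total_weight = total_weight, mean_lag = mean_lag, one_over_f = 1 / f)
```

Under this assumption the average age of the data contributing to the PCA is (1-f)/f, which is close to 1/f when f is small.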
Only one of the arguments n and f needs to be specified. If only n is given, f defaults to 1/n (the usual PCA of the sample covariance matrix, in which all data points have equal weight). If f is specified, its value overrides any value of n.
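The two ways of calling the function can be illustrated as follows. This is a sketch that relies only on the signature shown under Usage; the toy data, the value f = 0.1, and the names `fit_n` and `fit_f` are hypothetical. The code is skipped if the onlinePCA package is not installed.

```r
# Requires the onlinePCA package; skipped quietly if it is not installed
if (requireNamespace("onlinePCA", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(60 * 5), 60, 5)          # toy data: 60 observations in R^5
  init <- prcomp(x[1:50, ], center = FALSE)  # initial PCA on the first 50 rows
  lambda <- init$sdev[1:3]^2
  U <- init$rotation[, 1:3]

  # Equal weighting: only n is given, so f defaults to 1/n
  fit_n <- onlinePCA::incRpca(lambda, U, x[51, ], n = 50)

  # Exponential forgetting: only f is given, so n is not needed
  fit_f <- onlinePCA::incRpca(lambda, U, x[51, ], f = 0.1)
}
```

Both calls return a list with components `values` and `vectors`, truncated to the default q = length(lambda).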

Value

A list with components

`values`	updated eigenvalues in decreasing order.
`vectors`	updated eigenvectors.

References

Arora et al. (2012). Stochastic Optimization for PCA and PLS. 50th Annual Conference on Communication, Control, and Computing (Allerton).
Brand, M. (2002). Incremental singular value decomposition of uncertain data with missing values. European Conference on Computer Vision (ECCV).

Examples

## Simulate Brownian motion
library(onlinePCA)
n <- 100 # number of sample paths
d <- 50  # number of observation points
q <- 10  # number of PCs to compute
n0 <- 50 # number of sample paths used for initialization
x <- matrix(rnorm(n*d, sd = 1/sqrt(d)), n, d)
x <- t(apply(x, 1, cumsum)) # each row is one sample path
dim(x) # 100 50


## Incremental PCA (IPCA, centered)
pca <- prcomp(x[1:n0,]) # initialization
xbar <- pca$center
pca <- list(values=pca$sdev[1:q]^2, vectors=pca$rotation[,1:q])
for (i in (n0+1):n) {
  xbar <- updateMean(xbar, x[i,], i-1)
  pca <- incRpca(pca$values, pca$vectors, x[i,], i-1, q = q,
                 center = xbar)
}

## Incremental PCA (IPCA, uncentered)
pca <- prcomp(x[1:n0,],center=FALSE) # initialization
pca <- list(values = pca$sdev[1:q]^2, vectors = pca$rotation[,1:q])
for (i in (n0+1):n)
  pca <- incRpca(pca$values, pca$vectors, x[i,], i-1, q = q)

[Package onlinePCA version 1.3.2 Index]