R: Stochastic Gradient Ascent PCA

sgapca {onlinePCA}

R Documentation

Stochastic Gradient Ascent PCA

Description

Online PCA with the SGA algorithm of Oja (1992).

Usage

sgapca(lambda, U, x, gamma, q = length(lambda), center, 
	type = c("exact", "nn"), sort = TRUE)

Arguments

`lambda`	optional vector of eigenvalues.
`U`	matrix of eigenvectors (PC) stored in columns.
`x`	new data vector.
`gamma`	vector of gain parameters.
`q`	number of eigenvectors to compute.
`center`	optional centering vector for `x`.
`type`	algorithm implementation: "exact" or "nn" (neural network).
`sort`	Should the new eigenpairs be sorted?

Details

The gain vector gamma determines the weight placed on the new data in updating each principal component. The first coefficient of gamma corresponds to the first principal component, etc.. It can be specified as a single positive number (which is recycled by the function) or as a vector of length ncol(U). For larger values of gamma, more weight is placed on x and less on U. A common choice for (the components of) gamma is of the form c/n, with n the sample size and c a suitable positive constant.
The Stochastic Gradient Ascent PCA can be implemented exactly or through a neural network. The latter is less accurate but faster.
If sort is TRUE and lambda is not missing, the updated eigenpairs are sorted by decreasing eigenvalue. Otherwise, they are not sorted.

Value

A list with components

`values`	updated eigenvalues or NULL.
`vectors`	updated principal components.

References

Oja (1992). Principal components, Minor components, and linear neural networks. Neural Networks.

Examples

## Initialization
n <- 1e4  # sample size
n0 <- 5e3 # initial sample size
d <- 10   # number of variables
q <- d # number of PC to compute
x <- matrix(runif(n*d), n, d)
x <- x %*% diag(sqrt(12*(1:d)))
# The eigenvalues of x are close to 1, 2, ..., d
# and the corresponding eigenvectors are close to 
# the canonical basis of R^d

## SGA PCA
xbar <- colMeans(x[1:n0,])
pca <- batchpca(x[1:n0,], q, center=xbar, byrow=TRUE)
for (i in (n0+1):n) {
  xbar <- updateMean(xbar, x[i,], i-1)
  pca <- sgapca(pca$values, pca$vectors, x[i,], 2/i, q, xbar)
}
pca

[Package onlinePCA version 1.3.2 Index]