incRpca.block {onlinePCA} | R Documentation |
Incremental PCA with Block Update
Description
Sequential Karhunen-Loeve (SKL) algorithm of Levy and Lindenbaum (2000). The PCA can be updated with respect to a data matrix (not just a data vector).
Usage
incRpca.block(x, B, lambda, U, n0 = 0, f, q = length(lambda), center, byrow = FALSE)
Arguments
x |
data matrix |
B |
block size |
lambda |
initial eigenvalues (optional) |
U |
initial eigenvectors/PCs (optional) |
n0 |
initial sample size (optional) |
f |
vector of forgetting factors |
q |
number of requested PCs |
center |
centering vector for |
byrow |
Are the data vectors in |
Details
This incremental PCA algorithm utilizes QR factorization and RSVD.
It generalizes the algorithm incRpca
from vector to matrix/block updates. However, incRpca
should be preferred for vector updates as it is faster.
If lambda
and U
are specified, they are taken as the initial PCA. Otherwise, the PCA is initialized by the SVD of the first block of data in x
(B
data vectors). The number n0
is the sample size before observing x
(default value = 0). The number B
is the size of blocks to be used in the PCA updates. Ideally, B
should be a divisor of the number of data vectors in x
(otherwise the last smaller block is discarded). If U
is provided, then B
and q
default to the number of columns (=PC) of U
.
The argument f
determines the relative weight of current PCA and new data in each block update. Its length should be equal to the number of blocks in x
, say nblock
. If n0
and B
are provided, then f
defaults to B/(n0+(0:(nblock-1)*B))
, i.e., the case where all data points have equal weights. The values in f
should be in (0,1), with higher values giving more weight to new data and less to the current PCA.
Value
A list with components
values |
first |
vectors |
first |
References
Levy, A. and Lindenbaum, M. (2000). Sequential Karhunen-Loeve basis extraction and its application to images. IEEE Transactions on Image Processing.
See Also
Examples
## Simulate Brownian Motion
n <- 100 # number of sample paths
d <- 50 # number of observation points
x <- matrix(rnorm(n*d,sd=1/sqrt(d)),n,d)
x <- t(apply(x,1,cumsum)) # dim(x) = c(100,50)
q <- 10 # number of PC to compute
B <- 20 # block size
n0 <- B # initial sample size (if relevant)
## PCA without initial values
res1 <- incRpca.block(t(x), B, q=q) # data vectors in columns
res2 <- incRpca.block(x, B, q=q, byrow=TRUE) # data vectors in rows
all.equal(res1,res2) # TRUE
## PCA with initial values
svd0 <- svd(x[1:n0,], 0, n0) # use first block for initialization
lambda <- svd0$d[1:n0]^2/n0 # initial eigenvalues
U <- svd0$v # initial PC
res3 <- incRpca.block(x[-(1:n0),], B, lambda, U, n0, q=q, byrow=TRUE)
# run PCA with this initialization on rest of the data
all.equal(res1,res3) # compare with previous PCA: TRUE
## Compare with function incRpca
res4 <- list(values=lambda, vectors=U)
for (i in (n0+1):n)
res4 <- incRpca(res4$values, res4$vectors, x[i,], i-1, q=q)
B <- 1 # vector update
res5 <- incRpca.block(x[-(1:n0),], B, lambda, U, n0, q=q, byrow=TRUE)
ind <- which(sign(res5$vectors[1,]) != sign(res4$vectors[1,]))
res5$vectors[,ind] <- - res5$vectors[,ind] # align PCs (flip orientation as needed)
all.equal(res4,res5) # TRUE