pcaone {pcaone} | R Documentation |
Randomized Singular Value Decomposition Algorithm with Window-based Power Iterations from PCAone (Li et al 2022).
Description
The Randomized Singular Value Decomposition (RSVD) computes the near-optimal low-rank approximation of a rectangular matrix using a fast probablistic algorithm.
Usage
pcaone(
A,
k = NULL,
p = 7,
q = 10,
sdist = "normal",
method = "alg2",
windows = 64,
shuffle = FALSE
)
Arguments
A |
array_like; |
k |
integer; |
p |
integer, optional; |
q |
integer, optional; |
sdist |
string |
method |
string |
windows |
integer, optional; |
shuffle |
logical, optional; |
Details
The singular value decomposition (SVD) plays an important role in data analysis, and scientific computing.
Given a rectangular (m,n)
matrix A
, and a target rank k << min(m,n)
,
the SVD factors the input matrix A
as
A = U_{k} diag(d_{k}) V_{k}^\top
The k
left singular vectors are the columns of the
real or complex unitary matrix U
. The k
right singular vectors are the columns
of the real or complex unitary matrix V
. The k
dominant singular values are the
entries of d
, and non-negative and real numbers.
q
is an oversampling parameter to improve the approximation.
A value of at least 10 is recommended, and q=10
is set by default.
The parameter p
specifies the number of power (subspace) iterations
to reduce the approximation error. The power scheme is recommended,
especially when the singular values decay slowly. However, computing power iterations increases the
computational costs. Even though most RSVD implementations recommend p=3
power iterations by default,
it's always sufficient to run only few power iterations where our window-based power iterations ('alg2'
)
come to play. We recommend using windows=64
and p>=7
for pcaone algorithm2. As it is designed for large dataset,
we recommend using 'alg2'
when max(n,m) > 5000
.
If k > (min(n,m)/4)
, a deterministic partial or truncated svd
algorithm might be faster.
Value
pcaone
returns a list containing the following three components:
- d
array_like;
singular values; vector of length(k)
.- u
array_like;
left singular vectors;(m, k)
or(m, nu)
dimensional array.- v
array_like;
right singular vectors;(n, k)
or(n, nv)
dimensional array.
Note
The singular vectors are not unique and only defined up to sign. If a left singular vector has its sign changed, changing the sign of the corresponding right vector gives an equivalent decomposition.
Author(s)
Zilong Li zilong.dk@gmail.com
References
Z. Li, J Meisner, A Albrechtsen. "PCAone: fast and accurate out-of-core PCA framework for large scale biobank data" (2022) doi: 10.1101/2022.05.25.493261.
Examples
library('pcaone')
mat <- matrix(rnorm(100*20000), 100, 20000)
res <- pcaone(mat, k = 10, p = 7, method = "alg2")
str(res)
res <- pcaone(mat, k = 10, p = 7, method = "alg1")
str(res)