rpca {rsvd} R Documentation

Randomized principal component analysis (rpca).

Description

Fast computation of principal component analysis (PCA) using the randomized singular value decomposition (rsvd).

Usage

rpca(
  A,
  k = NULL,
  center = TRUE,
  scale = TRUE,
  retx = TRUE,
  p = 10,
  q = 2,
  rand = TRUE
)

Arguments

A

array_like;
a numeric (m, n) input matrix (or data frame) to be analyzed.
If the data contain NAs, na.omit is applied.

k

integer;
number of dominant principal components to be computed. It is required that k is less than or equal to min(m,n), but it is recommended that k << min(m,n).

center

bool, optional;
logical value which indicates whether the variables should be shifted to be zero centered (TRUE by default).

scale

bool, optional;
logical value which indicates whether the variables should be scaled to have unit variance (TRUE by default).

retx

bool, optional;
logical value indicating whether the rotated variables / scores should be returned (TRUE by default).

p

integer, optional;
oversampling parameter for rsvd (default p=10), see rsvd.

q

integer, optional;
number of additional power iterations for rsvd (default q=2), see rsvd.

rand

bool, optional;
if TRUE (the default), the rsvd routine is used; otherwise svd is used.

Details

Principal component analysis is an important linear dimension reduction technique.

Randomized PCA is computed via the randomized SVD algorithm (rsvd). The computational gain is substantial if the desired number of principal components is relatively small, i.e., k << min(m,n).
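
As a minimal sketch of this trade-off, the following compares the randomized route (rand = TRUE) with the deterministic svd route (rand = FALSE) on synthetic low-rank data. The sizes m, n, r and the object names fit_rand and fit_exact are assumptions chosen for illustration only; the calls use the arguments listed under Usage.

```r
library(rsvd)

set.seed(42)                      # rsvd draws random test vectors
m <- 500; n <- 50; r <- 5         # hypothetical sizes for illustration

# Synthetic data: a rank-r signal plus a small amount of noise
A <- matrix(rnorm(m * r), m, r) %*% matrix(rnorm(r * n), r, n) +
     1e-3 * matrix(rnorm(m * n), m, n)

fit_rand  <- rpca(A, k = r)                # randomized SVD (rand = TRUE)
fit_exact <- rpca(A, k = r, rand = FALSE)  # deterministic svd, for reference

# Largest relative difference between the two sets of standard deviations
max(abs(fit_rand$sdev - fit_exact$sdev) / fit_exact$sdev)

dim(fit_rand$rotation)   # (n, k)
dim(fit_rand$x)          # (m, k)
```

On data like this, where k is much smaller than min(m,n) and the spectrum decays quickly, the two fits should agree closely; the speed advantage of the randomized route is expected to show up mainly for much larger matrices.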

The print and summary methods can be used to present the results in a nice format. A scree plot can be produced with ggscreeplot. The individuals factor map can be produced with ggindplot, and a correlation plot with ggcorplot.

The predict function can be used to compute the scores of new observations. The data will automatically be centered (and scaled if requested). This is not fully supported for complex input matrices.
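
A short sketch of scoring new observations, assuming the iris data and an arbitrary 120/30 split (the split and the names idx, fit, new_scores and manual are illustrative, not part of the package). The manual computation at the end assumes that the stored center and scale components are applied to the new data before rotation, as the description above suggests.

```r
library(rsvd)

log.iris <- log(iris[, 1:4])

set.seed(1)
idx <- sample(nrow(log.iris), 120)     # hypothetical training subset

# Fit on the training rows, keeping the first two PCs
fit <- rpca(log.iris[idx, ], k = 2)

# Scores for the 30 held-out observations
new_scores <- predict(fit, log.iris[-idx, ])
dim(new_scores)

# The same scores computed by hand from the stored components
manual <- scale(as.matrix(log.iris[-idx, ]),
                center = fit$center, scale = fit$scale) %*% fit$rotation
all.equal(unname(as.matrix(new_scores)), unname(manual),
          check.attributes = FALSE)
```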

Value

rpca returns a list with class rpca containing the following components:

rotation

array_like;
the rotation (eigenvectors); (n, k) dimensional array.

eigvals

array_like;
eigenvalues; k dimensional vector.

sdev

array_like;
standard deviations of the principal components; k dimensional vector.

x

array_like;
the scores / rotated data; (m, k) dimensional array.

center, scale

array_like;
the centering and scaling used.

Note

The principal components are not unique: they are defined only up to sign (a constant of modulus one in the complex case), so they may differ between PCA implementations.

As in prcomp, the variances are computed with the usual divisor N - 1.
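
Both points can be checked against prcomp with a small sketch like the one below. The object names (fit_rpca, fit_prcomp, R1, R2, signs) are illustrative, and the sign alignment step is one possible way to compare rotations, not something the package provides.

```r
library(rsvd)

log.iris <- log(iris[, 1:4])

set.seed(1)
fit_rpca   <- rpca(log.iris, k = 2)                        # centered and scaled by default
fit_prcomp <- prcomp(log.iris, center = TRUE, scale. = TRUE)

# Standard deviations should match, since both use the divisor N - 1
all.equal(unname(fit_rpca$sdev), unname(fit_prcomp$sdev[1:2]), tolerance = 1e-6)

# Rotations may differ in sign per component; align the signs before comparing
R1 <- unname(as.matrix(fit_rpca$rotation))
R2 <- unname(fit_prcomp$rotation[, 1:2])
signs <- sign(colSums(R1 * R2))
all.equal(sweep(R1, 2, signs, "*"), R2, tolerance = 1e-6)
```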

Author(s)

N. Benjamin Erichson, erichson@berkeley.edu

See Also

ggscreeplot, ggindplot, ggcorplot, plot.rpca, predict, rsvd

Examples


library('rsvd')
#
# Load Edgar Anderson's Iris Data
#
data('iris')

#
# log transform
#
log.iris <- log( iris[ , 1:4] )
iris.species <- iris[ , 5]

#
# Perform rPCA and compute only the first two PCs
#
iris.rpca <- rpca(log.iris, k=2)
summary(iris.rpca) # Summary
print(iris.rpca) # Prints the rotations

#
# Use rPCA to compute all PCs, similar to prcomp
#
iris.rpca <- rpca(log.iris)
summary(iris.rpca) # Summary
print(iris.rpca) # Prints the rotations
plot(iris.rpca) # Produce screeplot, variable and individuals factor maps.


[Package rsvd version 1.0.5 Index]