fast.prcomp {gmodels} | R Documentation |
Efficient computation of principal components and singular value decompositions.
Description
The standard stats::prcomp() and svd() functions are very inefficient for wide matrices. fast.prcomp and fast.svd are modified versions that remain efficient even for matrices that are very wide.
Usage
fast.prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL)
Arguments
x |
data matrix |
retx |
a logical value indicating whether the rotated variables should be returned. |
center |
a logical value indicating whether the variables
should be shifted to be zero centered. Alternatively, a vector of
length equal to the number of columns of x can be supplied. |
scale. |
a logical value indicating whether the variables should
be scaled to have unit variance before the analysis takes
place. The default is FALSE. |
tol |
a value indicating the magnitude below which components
should be omitted. (Components are omitted if their
standard deviations are less than or equal to tol times the standard deviation of the first component.) |
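The effect of tol can be illustrated with a minimal sketch. It uses stats::prcomp() rather than fast.prcomp, on the assumption that both take the same arguments as the Usage section suggests, so that it runs without gmodels installed. The data, the 0.1 threshold, and the variable names are made up for illustration.

```r
# Sketch only: stats::prcomp() stands in for fast.prcomp here (assumed to
# accept the same arguments); the data and threshold are arbitrary.
set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
x[, 5] <- x[, 5] * 0.01             # make one column nearly constant

full    <- prcomp(x)                # tol = NULL: no components omitted
trimmed <- prcomp(x, tol = 0.1)     # drop components with sdev <= 0.1 * sdev[1]

# Number of components that survive the tol rule, computed by hand
kept <- sum(full$sdev > 0.1 * full$sdev[1])

ncol(full$rotation)                 # all min(nrow, ncol) components
ncol(trimmed$rotation)              # matches `kept`
```

Comparing the rotation matrices rather than sdev keeps the check independent of how a given R version reports the standard deviations of omitted components.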
Details
The current implementation of the function svd() in S-Plus and R is much slower when operating on a matrix with a large number of columns than on the transpose of this matrix, which has a large number of rows. As a consequence, stats::prcomp(), which uses svd(), is also very slow when applied to matrices with a large number of columns.
The simple solution is to use La.svd() instead of svd(). A suitable patch to stats::prcomp() has been submitted. In the meantime, the function fast.prcomp has been provided as a short-term workaround.
- fast.prcomp is a modified version of stats::prcomp() that calls La.svd() instead of svd().
- fast.svd is simply a wrapper around La.svd().
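The wrapper idea can be sketched as follows. This is not the gmodels source: svd_via_la is a hypothetical helper written for illustration. It relies only on the documented behaviour that La.svd() returns the components d, u and vt (the transpose of the right singular vectors), whereas svd() returns d, u and v.

```r
# Hypothetical helper (not the gmodels implementation): call La.svd() and
# return the result in the same shape as svd(), i.e. with v instead of vt.
svd_via_la <- function(x, ...) {
  x <- as.matrix(x)
  s <- La.svd(x, ...)
  list(d = s$d, u = s$u, v = t(s$vt))
}

set.seed(1)
x <- matrix(rnorm(6 * 40), nrow = 6, ncol = 40)   # small wide matrix
s <- svd_via_la(x)

# The factors reconstruct the input: x = u %*% diag(d) %*% t(v)
recon_err <- max(abs(x - s$u %*% diag(s$d) %*% t(s$v)))

# Singular values agree with those from svd()
d_err <- max(abs(s$d - svd(x)$d))
```

Both recon_err and d_err should be at the level of floating-point rounding. Singular vectors are only determined up to sign, which is why the Examples compare absolute values.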
Value
See the documentation for stats::prcomp() or svd().
Author(s)
Modifications by Gregory R. Warnes greg@warnes.net
See Also
stats::prcomp(), base::svd(), base::La.svd()
Examples
library(gmodels)  # provides fast.prcomp and fast.svd
# create test matrix
set.seed(4943546)
nr <- 50
nc <- 2000
x <- matrix( rnorm( nr*nc), nrow=nr, ncol=nc )
tx <- t(x)
# SVD directly on matrix is SLOW:
system.time( val.x <- svd(x)$u )
# SVD on t(matrix) is FAST:
system.time( val.tx <- svd(tx)$v )
# and the results are equivalent:
max( abs(val.x) - abs(val.tx) )
# Time gap disappears using fast.svd:
system.time( val.x <- fast.svd(x)$u )
system.time( val.tx <- fast.svd(tx)$v )
max( abs(val.x) - abs(val.tx) )
library(stats)
# prcomp directly on matrix is SLOW:
system.time( pr.x <- prcomp(x) )
# fast.prcomp is much faster
system.time( fast.pr.x <- fast.prcomp(x) )
# and the results are equivalent
max( pr.x$sdev - fast.pr.x$sdev )
max( abs(pr.x$rotation[,1:49]) - abs(fast.pr.x$rotation[,1:49]) )
max( abs(pr.x$x) - abs(fast.pr.x$x) )
# (except for the last and least significant component):
max( abs(pr.x$rotation[,50]) - abs(fast.pr.x$rotation[,50]) )