R: compute a low rank soft-thresholded svd by alternating...

svd.als {softImpute}

R Documentation

compute a low rank soft-thresholded svd by alternating orthogonal ridge regression

Description

fit a low-rank svd to a complete matrix by alternating orthogonal ridge regression. Special sparse-matrix classes available for very large matrices, including "SparseplusLowRank" versions for row and column centered sparse matrices.

Usage

svd.als(x, rank.max = 2, lambda = 0, thresh = 1e-05, maxit = 100,
        trace.it = FALSE, warm.start = NULL, final.svd = TRUE)

Arguments

`x`	An m by n matrix. Large matrices can be in "sparseMatrix" format, as well as "SparseplusLowRank". The latter arise after centering sparse matrices, for example with `biScale`, as well as in applications such as `softImpute`.
`rank.max`	The maximum rank for the solution. This is also the dimension of the left and right matrices of orthogonal singular vectors. 'rank.max' should be no bigger than 'min(dim(x)'.
`lambda`	The regularization parameter. `lambda=0` corresponds to an accelerated version of the orthogonal QR-algorithm. With `lambda>0` the algorithm amounts to alternating orthogonal ridge regression.
`thresh`	convergence threshold, measured as the relative changed in the Frobenius norm between two successive estimates.
`maxit`	maximum number of iterations.
`trace.it`	with `trace.it=TRUE`, convergence progress is reported.
`warm.start`	an svd object can be supplied as a warm start. If the solution requested has higher rank than the warm start, the additional subspace is initialized with random Gaussians (and then orthogonalized wrt the rest).
`final.svd`	Although in theory, this algorithm converges to the solution to a nuclear-norm regularized low-rank matrix approximation problem, with potentially some singular values equal to zero, in practice only near-zeros are achieved. This final step does one more iteration with `lambda=0`, followed by soft-thresholding.

Details

This algorithm solves the problem

\min ||X-M||_F^2 +\lambda ||M||_*

subject to rank(M)\leq r, where ||M||_* is the nuclear norm of M (sum of singular values). It achieves this by solving the related problem

\min ||X-AB'||_F^2 +\lambda/2 (||A||_F^2+||B||_F^2)

subject to rank(A)=rank(B)\leq r. The solution is a rank-restricted, soft-thresholded SVD of X.

Value

An svd object is returned, with components "u", "d", and "v".

`u`	an m by `rank.max` matrix with the left orthogonal singular vectors
`d`	a vector of length `rank.max` of soft-thresholded singular values
`v`	an n by `rank.max` matrix with the right orthogonal singular vectors

Author(s)

Trevor Hastie, Rahul Mazumder
Maintainer: Trevor Hastie hastie@stanford.edu

References

Rahul Mazumder, Trevor Hastie and Rob Tibshirani (2010) Spectral Regularization Algorithms for Learning Large Incomplete Matrices, https://web.stanford.edu/~hastie/Papers/mazumder10a.pdf
Journal of Machine Learning Research 11 (2010) 2287-2322

Examples

#create a matrix and run the algorithm
set.seed(101)
n=100
p=50
J=25
np=n*p
x=matrix(rnorm(n*J),n,J)%*%matrix(rnorm(J*p),J,p)+matrix(rnorm(np),n,p)/5
fit=svd.als(x,rank=25,lambda=50)
fit$d
pmax(svd(x)$d-50,0)
# now create a sparse matrix and do the same
nnz=trunc(np*.3)
inz=sample(seq(np),nnz,replace=FALSE)
i=row(x)[inz]
j=col(x)[inz]
x=rnorm(nnz)
xS=sparseMatrix(x=x,i=i,j=j)
fit2=svd.als(xS,rank=20,lambda=7)
fit2$d
pmax(svd(as.matrix(xS))$d-7,0)

[Package softImpute version 1.4-1 Index]