admm.spca {ADMM}R Documentation

Sparse PCA

Description

Sparse Principal Component Analysis aims at finding a sparse vector by solving

\textrm{max}_x~x^T\Sigma x \quad \textrm{s.t.} \quad \|x\|_2\le 1,~\|x\|_0\le K

where \|x\|_0 is the number of non-zero elements in a vector x. A convex relaxation of this problem was proposed to solve the following problem,

\textrm{max}_X~<\Sigma,X> ~\textrm{s.t.} \quad Tr(X)=1,~\|X\|_0 \le K^2, ~X\ge 0,~\textrm{rank}(X)=1

where X=xx^T is a (p\times p) matrix that is outer product of a vector x by itself, and X\ge 0 means the matrix X is positive semidefinite. With the rank condition dropped, it can be restated as

\textrm{max}_X~ <\Sigma,X>-\rho\|X\|_1 \quad \textrm{s.t.}\quad Tr(X)=1,X\ge 0.

After acquiring each principal component vector, an iterative step based on Schur complement deflation method is applied to regress out the impact of previously-computed projection vectors. It should be noted that those sparse basis may not be orthonormal.

Usage

admm.spca(
  Sigma,
  numpc,
  mu = 1,
  rho = 1,
  abstol = 1e-04,
  reltol = 0.01,
  maxiter = 1000
)

Arguments

Sigma

a (p\times p) (sample) covariance matrix.

numpc

number of principal components to be extracted.

mu

an augmented Lagrangian parameter.

rho

a regularization parameter for sparsity.

abstol

absolute tolerance stopping criterion.

reltol

relative tolerance stopping criterion.

maxiter

maximum number of iterations.

Value

a named list containing

basis

a (p\times numpc) matrix whose columns are sparse principal components.

history

a length-numpc list of dataframes recording iteration numerics. See the section for more details.

Iteration History

For SPCA implementation, main computation is sequentially performed for each projection vector. The history field is a list of length numpc, where each element is a data frame containing iteration history recording following fields over iterates,

r_norm

norm of primal residual

s_norm

norm of dual residual

eps_pri

feasibility tolerance for primal feasibility condition

eps_dual

feasibility tolerance for dual feasibility condition

In accordance with the paper, iteration stops when both r_norm and s_norm values become smaller than eps_pri and eps_dual, respectively.

References

Ma S (2013). “Alternating Direction Method of Multipliers for Sparse Principal Component Analysis.” Journal of the Operations Research Society of China, 1(2), 253–274. ISSN 2194-668X, 2194-6698, doi: 10.1007/s40305-013-0016-9.

Examples

## generate a random matrix and compute its sample covariance
X    = matrix(rnorm(1000*5),nrow=1000)
covX = stats::cov(X)

## compute 3 sparse basis
output = admm.spca(covX, 3)


[Package ADMM version 0.3.3 Index]