| admm.spca {ADMM} | R Documentation |
Sparse PCA
Description
Sparse Principal Component Analysis aims at finding a sparse vector by solving
\textrm{max}_x~x^T\Sigma x \quad \textrm{s.t.} \quad \|x\|_2\le 1,~\|x\|_0\le K
where \|x\|_0 is the number of non-zero elements in a vector x. A convex relaxation
of this problem was proposed to solve the following problem,
\textrm{max}_X~<\Sigma,X> ~\textrm{s.t.} \quad Tr(X)=1,~\|X\|_0 \le K^2, ~X\ge 0,~\textrm{rank}(X)=1
where X=xx^T is a (p\times p) matrix that is outer product of a vector x by itself,
and X\ge 0 means the matrix X is positive semidefinite.
With the rank condition dropped, it can be restated as
\textrm{max}_X~ <\Sigma,X>-\rho\|X\|_1 \quad \textrm{s.t.}\quad Tr(X)=1,X\ge 0.
After acquiring each principal component vector, an iterative step based on Schur complement deflation method is applied to regress out the impact of previously-computed projection vectors. It should be noted that those sparse basis may not be orthonormal.
Usage
admm.spca(
Sigma,
numpc,
mu = 1,
rho = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
Sigma |
a |
numpc |
number of principal components to be extracted. |
mu |
an augmented Lagrangian parameter. |
rho |
a regularization parameter for sparsity. |
abstol |
absolute tolerance stopping criterion. |
reltol |
relative tolerance stopping criterion. |
maxiter |
maximum number of iterations. |
Value
a named list containing
- basis
a
(p\times numpc)matrix whose columns are sparse principal components.- history
a length-
numpclist of dataframes recording iteration numerics. See the section for more details.
Iteration History
For SPCA implementation, main computation is sequentially performed for each projection vector. The history
field is a list of length numpc, where each element is a data frame containing iteration history recording
following fields over iterates,
- r_norm
norm of primal residual
- s_norm
norm of dual residual
- eps_pri
feasibility tolerance for primal feasibility condition
- eps_dual
feasibility tolerance for dual feasibility condition
In accordance with the paper, iteration stops when both r_norm and s_norm values
become smaller than eps_pri and eps_dual, respectively.
References
Ma S (2013). “Alternating Direction Method of Multipliers for Sparse Principal Component Analysis.” Journal of the Operations Research Society of China, 1(2), 253–274. ISSN 2194-668X, 2194-6698, doi: 10.1007/s40305-013-0016-9.
Examples
## generate a random matrix and compute its sample covariance
X = matrix(rnorm(1000*5),nrow=1000)
covX = stats::cov(X)
## compute 3 sparse basis
output = admm.spca(covX, 3)