admm.spca {ADMM} | R Documentation |
Sparse PCA
Description
Sparse Principal Component Analysis aims at finding a sparse vector by solving
\textrm{max}_x~x^T\Sigma x \quad \textrm{s.t.} \quad \|x\|_2\le 1,~\|x\|_0\le K
where \|x\|_0
is the number of non-zero elements in a vector x
. A convex relaxation
of this problem was proposed to solve the following problem,
\textrm{max}_X~<\Sigma,X> ~\textrm{s.t.} \quad Tr(X)=1,~\|X\|_0 \le K^2, ~X\ge 0,~\textrm{rank}(X)=1
where X=xx^T
is a (p\times p)
matrix that is outer product of a vector x
by itself,
and X\ge 0
means the matrix X
is positive semidefinite.
With the rank condition dropped, it can be restated as
\textrm{max}_X~ <\Sigma,X>-\rho\|X\|_1 \quad \textrm{s.t.}\quad Tr(X)=1,X\ge 0.
After acquiring each principal component vector, an iterative step based on Schur complement deflation method is applied to regress out the impact of previously-computed projection vectors. It should be noted that those sparse basis may not be orthonormal.
Usage
admm.spca(
Sigma,
numpc,
mu = 1,
rho = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
Sigma |
a |
numpc |
number of principal components to be extracted. |
mu |
an augmented Lagrangian parameter. |
rho |
a regularization parameter for sparsity. |
abstol |
absolute tolerance stopping criterion. |
reltol |
relative tolerance stopping criterion. |
maxiter |
maximum number of iterations. |
Value
a named list containing
- basis
a
(p\times numpc)
matrix whose columns are sparse principal components.- history
a length-
numpc
list of dataframes recording iteration numerics. See the section for more details.
Iteration History
For SPCA implementation, main computation is sequentially performed for each projection vector. The history
field is a list of length numpc
, where each element is a data frame containing iteration history recording
following fields over iterates,
- r_norm
norm of primal residual
- s_norm
norm of dual residual
- eps_pri
feasibility tolerance for primal feasibility condition
- eps_dual
feasibility tolerance for dual feasibility condition
In accordance with the paper, iteration stops when both r_norm
and s_norm
values
become smaller than eps_pri
and eps_dual
, respectively.
References
Ma S (2013). “Alternating Direction Method of Multipliers for Sparse Principal Component Analysis.” Journal of the Operations Research Society of China, 1(2), 253–274. ISSN 2194-668X, 2194-6698, doi: 10.1007/s40305-013-0016-9.
Examples
## generate a random matrix and compute its sample covariance
X = matrix(rnorm(1000*5),nrow=1000)
covX = stats::cov(X)
## compute 3 sparse basis
output = admm.spca(covX, 3)