PCAgrid {pcaPP} | R Documentation |
(Sparse) Robust Principal Components using the Grid search algorithm
Description
Computes a desired number of (sparse) (robust) principal components using the grid search algorithm in the plane. The global optimum of the objective function is searched in planes, not in the p-dimensional space, using regular grids in these planes.
Usage
PCAgrid (x, k = 2, method = c ("mad", "sd", "qn"),
maxiter = 10, splitcircle = 25, scores = TRUE, zero.tol = 1e-16,
center = l1median, scale, trace = 0, store.call = TRUE, control, ...)
sPCAgrid (x, k = 2, method = c ("mad", "sd", "qn"), lambda = 1,
maxiter = 10, splitcircle = 25, scores = TRUE, zero.tol = 1e-16,
center = l1median, scale, trace = 0, store.call = TRUE, control, ...)
Arguments
x |
a numerical matrix or data frame of dimension ( |
k |
the desired number of components to compute |
method |
the scale estimator used to detect the direction with the
largest variance. Possible values are |
lambda |
the sparseness constraint's strength( |
maxiter |
the maximum number of iterations. |
splitcircle |
the number of directions in which the algorithm should search for the largest variance. The direction with the largest variance is searched for among the directions defined by a number of equally spaced points on the unit circle. This argument determines how many such points are used to split the unit circle. |
scores |
A logical value indicating whether the scores of the principal component should be calculated. |
zero.tol |
the zero tolerance used internally for checking convergence, etc. |
center |
this argument indicates how the data is to be centered. It
can be a function like |
scale |
this argument indicates how the data is to be rescaled. It
can be a function like |
trace |
an integer value >= 0, specifying the tracing level. |
store.call |
a logical variable, specifying whether the function call shall be stored in the result structure. |
control |
a list whose elements must be the same as (or a subset of) the parameters above. If the control object is supplied, the parameters from it are used and override any values given for the same parameters directly. |
... |
further arguments passed to or from other functions. |
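As an illustration of the control argument, the sketch below collects several of the parameters listed above in a list and passes it to PCAgrid. The data matrix x and the chosen parameter values are made up for this example, and the code assumes that the pcaPP package is installed.

```r
library(pcaPP)

# Simulated data, purely for illustration: 100 observations, 5 variables
set.seed(1)
x <- matrix(rnorm(500), nrow = 100, ncol = 5)

# Element names must match argument names of PCAgrid
ctrl <- list(k = 3, method = "qn", splitcircle = 50)

# The settings are taken from the control list
pc <- PCAgrid(x, control = ctrl)

# Number of components stored in the result
ncol(pc$loadings)
```

Keeping the settings in one list makes it easy to reuse the same configuration across several calls.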
Details
In contrast to PCAgrid, the function sPCAgrid computes sparse
principal components. The strength of the applied sparseness constraint is
specified by argument lambda.
Similar to the function princomp, there is a print method
for these objects that prints the results in a nice format, and the
plot method produces a scree plot (screeplot). There is
also a biplot method.
Angle halving is an extension of the original algorithm. In the original
algorithm, the search directions are determined by a number of points on the
unit circle in the interval [-pi/2, pi/2). Angle halving means this interval is
halved in each iteration, e.g. for the first approximation the above-mentioned
interval is used, for the second approximation it is halved to
[-pi/4, pi/4), and so on. This usually gives better results with fewer
iterations.
NOTE: in previous implementations angle halving could be suppressed by the
former argument "anglehalving". This can still be done by setting
argument maxiter = 0.
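The grid search with angle halving can be sketched for a single plane in base R. This is only an illustration under simplifying assumptions, not the implementation used by pcaPP: the function grid_search_plane and its arguments are hypothetical names, the data are assumed to be two-dimensional and already centered, and the treatment of maxiter does not mirror the package's handling of that argument.

```r
# Hypothetical helper (not part of pcaPP): find the direction in the plane
# along which the projected data have the largest (robust) scale.
grid_search_plane <- function(x, splitcircle = 25, maxiter = 10,
                              scale_fun = mad) {
  stopifnot(is.matrix(x), ncol(x) == 2)
  theta <- 0            # current best angle; the direction is (cos, sin)
  half_width <- pi / 2  # search interval is [theta - hw, theta + hw)
  best <- scale_fun(drop(x %*% c(cos(theta), sin(theta))))
  for (i in seq_len(maxiter)) {
    # equally spaced candidate angles, right endpoint excluded
    offsets <- seq(-half_width, half_width,
                   length.out = splitcircle + 1)[seq_len(splitcircle)]
    cand <- theta + offsets
    dirs <- cbind(cos(cand), sin(cand))
    # scale of the data projected onto each candidate direction
    spread <- apply(x %*% t(dirs), 2, scale_fun)
    j <- which.max(spread)
    if (spread[j] > best) {
      theta <- cand[j]
      best <- spread[j]
    }
    half_width <- half_width / 2  # angle halving
  }
  c(cos(theta), sin(theta))
}

# Simulated data whose dominant direction is rotated by the angle a
set.seed(1)
a <- pi / 6
z <- cbind(rnorm(1000, sd = 5), rnorm(1000))
x <- z %*% rbind(c(cos(a), sin(a)), c(-sin(a), cos(a)))
d <- grid_search_plane(x)
```

Because the interval shrinks around the current best direction, later iterations refine the estimate on an ever finer grid without increasing the number of candidate directions per iteration.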
Value
The function returns an object of class "princomp", i.e. a list
similar to the output of the function princomp.
sdev |
the (robust) standard deviations of the principal components. |
loadings |
the matrix of variable loadings (i.e., a matrix whose columns
contain the eigenvectors). This is of class |
center |
the means that were subtracted. |
scale |
the scalings applied to each variable. |
n.obs |
the number of observations. |
scores |
if |
call |
the matched call. |
obj |
A vector containing the objective function values. For function
|
lambda |
The lambda each component has been calculated with
( |
Note
See the vignette "Compiling pcaPP for Matlab" which comes with this package to compile and use these functions in Matlab.
Author(s)
Heinrich Fritz, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>
References
C. Croux, P. Filzmoser, M. Oliveira (2007). Algorithms for Projection-Pursuit Robust Principal Component Analysis, Chemometrics and Intelligent Laboratory Systems, Vol. 87, pp. 218-225.
C. Croux, P. Filzmoser, H. Fritz (2011). Robust Sparse Principal Component Analysis Based on Projection-Pursuit. To appear.
See Also
Examples
# multivariate data with outliers
library(pcaPP)    # provides PCAgrid, sPCAgrid and data.Zou
library(mvtnorm)  # provides rmvnorm
x <- rbind(rmvnorm(200, rep(0, 6), diag(c(5, rep(1,5)))),
rmvnorm( 15, c(0, rep(20, 5)), diag(rep(1, 6))))
# Here we calculate the principal components with PCAgrid
pc <- PCAgrid(x)
# we could draw a biplot too:
biplot(pc)
# now we want to compare the results with the non-robust principal components
pc <- princomp(x)
# again, a biplot for comparison:
biplot(pc)
## Sparse loadings
set.seed (0)
x <- data.Zou ()
## applying PCA
pc <- princomp (x)
## the corresponding non-sparse loadings
unclass (pc$loadings[,1:3])
pc$sdev[1:3]
## lambda as calculated in the opt.TPO - example
lambda <- c (0.23, 0.34, 0.005)
## applying sparse PCA
spc <- sPCAgrid (x, k = 3, lambda = lambda, method = "sd")
unclass (spc$loadings)
spc$sdev[1:3]
## comparing the non-sparse and sparse biplot
par (mfrow = 1:2)
biplot (pc, main = "non-sparse PCs")
biplot (spc, main = "sparse PCs")