selectLambda {rospca} | R Documentation |
Selection of sparsity parameter using IC
Description
Selection of the sparsity parameter for ROSPCA and SCoTLASS using BIC of Hubert et al. (2016), and for SRPCA using BIC of Croux et al. (2013).
Usage
selectLambda(X, k, kmax = 10, method = "ROSPCA", lmin = 0, lmax = 2, lstep = 0.02,
alpha = 0.75, stand = TRUE, skew = FALSE, multicore = FALSE,
mc.cores = NULL, P = NULL, ndir = "all")
Arguments
X |
An |
k |
Number of Principal Components (PCs). |
kmax |
Maximal number of PCs to be computed, only used when |
method |
PCA method to use: ROSPCA ( |
lmin |
Minimal value of |
lmax |
Maximal value of |
lstep |
Difference between two consecutive values of |
alpha |
Robustness parameter for ROSPCA, default is 0.75. |
stand |
Logical indicating if the data should be standardised, default is |
skew |
Logical indicating if the skewed version of ROSPCA should be applied, default is |
multicore |
Logical indicating if multiple cores can be used, default is |
mc.cores |
Number of cores to use if |
P |
True loadings matrix, a numeric matrix of size |
ndir |
Number of directions used when computing the outlyingness (or the adjusted outlyingness when |
Details
We select an optimal value of \lambda for a given method and dataset by evaluating an equidistant grid of \lambda values. For each value of \lambda, we apply the method to the dataset using this sparsity parameter and compute an Information Criterion (IC). The optimal value of \lambda is the one with the minimal IC. The ICs we consider are the BIC of Hubert et al. (2016) for ROSPCA and SCoTLASS, and the BIC of Croux et al. (2013) for SRPCA.
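The grid search described above can be sketched in base R. This is only an illustration and not the package's internal code: toy_ic is a hypothetical stand-in for "fit the method with this \lambda and return its IC", and the assumption that the grid is built with seq(lmin, lmax, by = lstep) is ours.

```r
# Hypothetical stand-in for fitting the PCA method with a given lambda
# and returning its information criterion (not part of rospca).
toy_ic <- function(lambda) (lambda - 0.7)^2 + 1

# Equidistant grid of candidate sparsity parameters (assumed construction).
lmin <- 0
lmax <- 2
lstep <- 0.1
Lambda <- seq(lmin, lmax, by = lstep)

# Evaluate the IC at every grid point.
IC <- vapply(Lambda, toy_ic, numeric(1))

# The optimal lambda is the grid value with the smallest IC.
best <- which.min(IC)
opt.lambda <- Lambda[best]
min.IC <- IC[best]
```

The names Lambda, IC, opt.lambda and min.IC mirror the components of the returned list documented under Value. A smaller lstep gives a finer grid at the cost of one model fit per extra grid point.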
The BIC of Hubert et al. (2016) is defined as
BIC(\lambda) = \ln\left(\frac{1}{h_1 p}\sum_{i=1}^{h_1} OD^2_{(i)}(\lambda)\right) + df(\lambda)\,\frac{\ln(h_1 p)}{h_1 p},
where h_1 is the size of H_1 (the subset of observations that are kept in the non-sparse reweighting step) and OD_{(i)}(\lambda) is the i-th smallest orthogonal distance for the model when using \lambda as the sparsity parameter. The degrees of freedom df(\lambda) are the number of non-zero loadings when \lambda is used as the sparsity parameter.
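As a rough illustration of the formula above, the BIC can be computed in base R from a vector of orthogonal distances. The function bic_hubert and its arguments are hypothetical names, not part of rospca, and we assume here that p denotes the number of variables.

```r
# Hypothetical helper (not part of rospca): BIC of Hubert et al. (2016)
# for a single value of lambda.
#   od : orthogonal distances of the observations for this lambda
#   h1 : size of the subset H_1
#   p  : number of variables (assumed meaning of p in the formula)
#   df : number of non-zero loadings for this lambda
bic_hubert <- function(od, h1, p, df) {
  stopifnot(h1 >= 1, h1 <= length(od), p >= 1, df >= 0)
  # Squared orthogonal distances, keeping the h1 smallest ones.
  od2 <- sort(od^2)[seq_len(h1)]
  # Goodness-of-fit term plus the sparsity penalty.
  log(sum(od2) / (h1 * p)) + df * log(h1 * p) / (h1 * p)
}

# Toy input: the largest distance (10) is excluded because h1 = 3.
bic_value <- bic_hubert(od = c(3, 1, 2, 10), h1 = 3, p = 2, df = 4)
```

The first term rewards a good fit on the h_1 observations with the smallest orthogonal distances, and the second term penalises each additional non-zero loading.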
Value
A list with components:
opt.lambda |
Value of |
min.IC |
Minimal value of IC. |
Lambda |
Numeric vector containing the used values of |
IC |
Numeric vector containing the IC values corresponding to all values of |
loadings |
Loadings obtained using method with sparsity parameter |
fit |
Fit obtained using method with sparsity parameter |
type |
Type of IC used: |
measure |
A numeric vector containing the standardised angles between the true and the estimated loadings matrix for each value of |
Author(s)
Tom Reynkens
References
Hubert, M., Reynkens, T., Schmitt, E., and Verdonck, T. (2016), “Sparse PCA for High-Dimensional Data with Outliers,” Technometrics, 58, 424–434.
Croux, C., Filzmoser, P., and Fritz, H. (2013), “Robust Sparse Principal Component Analysis,” Technometrics, 55, 202–214.
See Also
Examples
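library(rospca)
# Generate one dataset with 100 observations, 10 variables and 20% contamination
X <- dataGen(m=1, n=100, p=10, eps=0.2, bLength=4)$data[[1]]
# Select the sparsity parameter for ROSPCA on a coarse grid (step 0.1)
sl <- selectLambda(X, k=2, method="ROSPCA", lstep=0.1)
selectPlot(sl)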