ICPCA {cellWise} R Documentation

## Iterative Classical PCA

### Description

This function carries out classical PCA when the data may contain missing values, by an iterative algorithm. It is based on a Matlab function from the Missing Data Imputation Toolbox v1.0 by A. Folch-Fortuny, F. Arteaga and A. Ferrer.
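The general idea of fitting PCA iteratively in the presence of missing values can be sketched in base R as follows. This is a minimal illustration, not the code used by `ICPCA` or by the Matlab toolbox: the helper name `iterative_pca_impute`, the column-mean starting values, and the stopping rule (largest change in any imputed cell below `tol`) are all assumptions made for this example.

```r
# Hypothetical helper for illustration only; not part of cellWise.
iterative_pca_impute <- function(X, k, maxiter = 20, tol = 0.005) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Assumed starting values: fill each missing cell with its column mean.
  Ximp <- X
  Ximp[miss] <- colMeans(X, na.rm = TRUE)[col(X)[miss]]
  diff <- 0
  for (it in seq_len(maxiter)) {
    center <- colMeans(Ximp)
    Xc <- sweep(Ximp, 2, center)                       # center the columns
    sv <- svd(Xc, nu = k, nv = k)                      # classical PCA via SVD
    fit <- sv$u %*% diag(sv$d[1:k], k, k) %*% t(sv$v)  # rank-k fit
    fit <- sweep(fit, 2, center, "+")                  # add the center back
    # Assumed convergence criterion: largest change in any imputed cell.
    diff <- if (any(miss)) max(abs(fit[miss] - Ximp[miss])) else 0
    Ximp[miss] <- fit[miss]                            # update missing cells only
    if (diff < tol) break
  }
  list(X.imp = Ximp, iterations = it, diff = diff)
}

set.seed(1)
# Toy data with a two-dimensional structure plus small noise.
X_demo <- matrix(rnorm(50 * 2), 50, 2) %*% matrix(rnorm(2 * 6), 2, 6) +
  matrix(rnorm(50 * 6, sd = 0.1), 50, 6)
X_miss <- X_demo
X_miss[sample(length(X_miss), 30)] <- NA
res <- iterative_pca_impute(X_miss, k = 2)
anyNA(res$X.imp)   # FALSE: all missing cells have been filled in
```

Only the cells that were `NA` are updated in each pass; observed cells are never changed.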

### Usage

```
ICPCA(X, k, scale = FALSE, maxiter = 20, tol = 0.005,
      tolProb = 0.99, distprob = 0.99)
```

### Arguments

| Argument | Description |
|---|---|
| `X` | The input data, which must be a matrix or a data frame. It may contain `NA`s. It must always be provided. |
| `k` | The desired number of principal components. |
| `scale` | A value indicating whether and how the original variables should be scaled. If `scale = FALSE` (default) or `scale = NULL`, no scaling is performed (and a vector of 1s is returned in the `$scaleX` slot). If `scale = TRUE`, the variables are scaled to have a standard deviation of 1. Alternatively, `scale` can be a function like `mad`, or a vector of length equal to the number of columns of `X`. The resulting scale estimates are returned in the `$scaleX` slot of the output. |
| `maxiter` | Maximum number of iterations. Default is 20. |
| `tol` | Tolerance for iterations. Default is 0.005. |
| `tolProb` | Tolerance probability for residuals. Default is 0.99. |
| `distprob` | Probability determining the cutoff values for orthogonal and score distances. Default is 0.99. |
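As an illustration of the `scale` argument, the sketch below fits the same data three times: without scaling, with `scale = TRUE`, and with the function `mad`. The simulated data and the object names (`fit_raw`, `fit_sd`, `fit_mad`) are assumptions made for this example; only the argument values come from the description above.

```r
library(cellWise)

set.seed(1)
n <- 100; d <- 5
# Toy data with a two-dimensional structure plus noise (illustrative only).
x <- matrix(rnorm(n * 2), n, 2) %*% matrix(rnorm(2 * d), 2, d) +
  matrix(rnorm(n * d, sd = 0.2), n, d)
x[sample(n * d, 25)] <- NA

fit_raw <- ICPCA(x, k = 2)                # default: scale = FALSE
fit_sd  <- ICPCA(x, k = 2, scale = TRUE)  # scale to standard deviation 1
fit_mad <- ICPCA(x, k = 2, scale = mad)   # scale by a user-supplied function

# The scales that were used are returned in the scaleX slot.
fit_raw$scaleX
fit_sd$scaleX
fit_mad$scaleX
```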

### Value

A list with components:

| Component | Description |
|---|---|
| `scaleX` | The scales of the columns of `X`. |
| `k` | The number of principal components. |
| `loadings` | The columns are the `k` loading vectors. |
| `eigenvalues` | The `k` eigenvalues. |
| `center` | Vector with the fitted center. |
| `covmatrix` | Estimated covariance matrix. |
| `It` | Number of iteration steps. |
| `diff` | Convergence criterion. |
| `X.NAimp` | Data with all `NA`s imputed. |
| `scores` | Scores of `X.NAimp`. |
| `OD` | Orthogonal distances of the rows of `X.NAimp`. |
| `cutoffOD` | Cutoff value for the OD. |
| `SD` | Score distances of the rows of `X.NAimp`. |
| `cutoffSD` | Cutoff value for the SD. |
| `indrows` | Row numbers of rowwise outliers. |
| `residScale` | Scale of the residuals. |
| `stdResid` | Standardized residuals. Note that these are `NA` for all missing values of `X`. |
| `indcells` | Indices of cellwise outliers. |
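The components can be inspected directly from the returned list. The sketch below uses simulated data, and the object name `out` is an assumption made for this example; the component names are those listed above.

```r
library(cellWise)

set.seed(2)
n <- 100; d <- 5
# Toy data with a two-dimensional structure plus noise (illustrative only).
x <- matrix(rnorm(n * 2), n, 2) %*% matrix(rnorm(2 * d), 2, d) +
  matrix(rnorm(n * d, sd = 0.2), n, d)
x[sample(n * d, 25)] <- NA

out <- ICPCA(x, k = 2)

dim(out$loadings)     # one column per principal component
out$eigenvalues       # the k eigenvalues
out$It                # number of iteration steps taken
anyNA(out$X.NAimp)    # the imputed data contain no NAs
out$indrows           # row numbers flagged as rowwise outliers
head(out$indcells)    # first few indices flagged as cellwise outliers

# Standardized residuals are NA exactly where X was missing.
all(is.na(out$stdResid[is.na(x)]))
```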

### Author(s)

Wannes Van Den Bossche

### References

Folch-Fortuny, A., Arteaga, F., Ferrer, A. (2016). Missing Data Imputation Toolbox for MATLAB. Chemometrics and Intelligent Laboratory Systems, 154, 93-100.

### Examples

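```
library(MASS)      # for mvrnorm()
library(cellWise)  # for ICPCA()
set.seed(12345)
n <- 100; d <- 10
# Covariance matrix with unit variances and pairwise correlations of 0.9
A <- diag(d) * 0.1 + 0.9
x <- mvrnorm(n, rep(0, d), A)
# Set 100 randomly chosen cells to NA
x[sample(1:(n * d), 100, FALSE)] <- NA
ICPCA.out <- ICPCA(x, k = 2)
plot(ICPCA.out$scores)
```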

[Package cellWise version 2.2.5 Index]