R: whitens multivariate data

whiten {ForeCA}

R Documentation

whitens multivariate data

Description

whiten transforms a multivariate K-dimensional signal \mathbf{X} with mean \boldsymbol \mu_X and covariance matrix \Sigma_{X} to a whitened signal \mathbf{U} with mean \boldsymbol 0 and \Sigma_U = I_K. Thus it centers the signal and makes it contemporaneously uncorrelated. See Details.

check_whitened checks if data has been whitened; i.e., if it has zero mean, unit variance, and is uncorrelated.

sqrt_matrix computes the square root \mathbf{B} of a square matrix \mathbf{A}. The matrix \mathbf{B} satisfies \mathbf{B} \mathbf{B} = \mathbf{A}.

Usage

whiten(data)

check_whitened(data, check.attribute.only = TRUE)

sqrt_matrix(mat, return.sqrt.only = TRUE, symmetric = FALSE)

Arguments

`data`	`n \times K` array representing `n` observations of `K` variables.
`check.attribute.only`	logical; if `TRUE` it checks the attribute only. This is much faster (it just needs to look up one attribute value), but it might not surface silent bugs. For the sake of performance the package uses the attribute version by default. For testing and debugging, the full computational version can be used instead.
`mat`	a square `K \times K` matrix.
`return.sqrt.only`	logical; if `TRUE` (default) it returns only the square root matrix; if `FALSE` it returns other auxiliary results (eigenvectors and eigenvalues, and inverse of the square root matrix).
`symmetric`	logical; if `TRUE` the `eigen`-solver assumes that the matrix is symmetric (which makes it much faster). This is particularly useful for a covariance matrix (which is used in `whiten`). Default: `FALSE`.

Details

whiten uses zero-phase component analysis (ZCA) (also known as zero-phase whitening filters) to whiten the data; i.e., it uses the inverse square root of the covariance matrix of \mathbf{X} (see sqrt_matrix) as the whitening transformation. This means that, on top of PCA, the uncorrelated principal components are back-transformed to the original space using the transpose of the eigenvector matrix. The advantage is that the whitened variables remain comparable to the original \mathbf{X}. See References for details.
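The ZCA transformation described above can be sketched in base R. This is a minimal illustration, not the ForeCA implementation: the names `X_centered`, `W`, and `U`, the mixing matrix, and the seed are assumptions chosen for the example.

```r
# Minimal ZCA whitening sketch in base R (not the ForeCA code).
set.seed(1)
X <- matrix(rnorm(200), ncol = 2) %*% matrix(c(1, 0.5, 0.2, 2), ncol = 2)

# Center the columns, then eigen-decompose the sample covariance matrix.
X_centered <- sweep(X, 2, colMeans(X))
evd <- eigen(cov(X_centered), symmetric = TRUE)

# Inverse square root of the covariance matrix: V diag(lambda^(-1/2)) V'.
W <- evd$vectors %*% diag(1 / sqrt(evd$values)) %*% t(evd$vectors)

# Whitened signal: zero mean and identity covariance (up to rounding error).
U <- X_centered %*% W
round(colMeans(U), 10)
round(cov(U), 10)
```

Multiplying by `t(evd$vectors)` at the end is what distinguishes ZCA from plain PCA whitening: it rotates the rescaled principal components back into the coordinate system of the original variables.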

The square root of a square n \times n matrix \mathbf{A} can be computed from its eigen-decomposition. For a symmetric \mathbf{A} (such as a covariance matrix) the eigenvectors can be chosen orthonormal, so that

\mathbf{A} = \mathbf{V} \Lambda \mathbf{V}',

where \Lambda is an n \times n diagonal matrix with the eigenvalues \lambda_1, \ldots, \lambda_n on the diagonal. The square root is then \mathbf{B} = \mathbf{V} \Lambda^{1/2} \mathbf{V}', where \Lambda^{1/2} = diag(\lambda_1^{1/2}, \ldots, \lambda_n^{1/2}).

Similarly, the inverse square root is defined as \mathbf{A}^{-1/2} = \mathbf{V} \Lambda^{-1/2} \mathbf{V}', where \Lambda^{-1/2} = diag(\lambda_1^{-1/2}, \ldots, \lambda_n^{-1/2}) (provided that \lambda_i \neq 0).
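As a worked illustration of these formulas, the following base R sketch builds the square root and the inverse square root of a small symmetric positive definite matrix. The matrix `A` is an arbitrary example, and the code only mirrors the formulas; it is not taken from the package source.

```r
# Symmetric positive definite example matrix (arbitrary choice).
A <- matrix(c(4, 1, 1, 3), ncol = 2)

# Eigen-decomposition A = V Lambda V'.
evd <- eigen(A, symmetric = TRUE)
V <- evd$vectors

# Square root and inverse square root from the eigenvalues.
B     <- V %*% diag(sqrt(evd$values))     %*% t(V)
B_inv <- V %*% diag(1 / sqrt(evd$values)) %*% t(V)

# Both identities hold up to floating-point error.
all.equal(B %*% B, A)
all.equal(B %*% B_inv, diag(2))
```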

Value

whiten returns a list with the whitened data, the transformation, and other useful quantities.

check_whitened throws an error if the input is not whitened, and returns (invisibly) the data with an attribute 'whitened' equal to TRUE. This makes it possible to tag the data with the attribute, so that the full check on the actual data (slow) needs to run only once and later calls can use the attribute lookup (fast).
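The check-once, look-up-later pattern can be sketched as follows. The helper `check_whitened_sketch` and its `tol` argument are hypothetical and only illustrate the idea; the actual check_whitened may differ in its tolerances and error messages.

```r
# Hypothetical helper; NOT the ForeCA implementation of check_whitened().
check_whitened_sketch <- function(data, check.attribute.only = TRUE, tol = 1e-8) {
  if (check.attribute.only) {
    # Fast path: trust the attribute set by an earlier full check.
    stopifnot(isTRUE(attr(data, "whitened")))
  } else {
    # Slow path: verify zero mean and identity covariance numerically.
    K <- ncol(data)
    stopifnot(all(abs(colMeans(data)) < tol),
              max(abs(cov(data) - diag(K))) < tol)
    attr(data, "whitened") <- TRUE
  }
  invisible(data)
}

# Build whitened data in base R, run the full check once, then reuse the attribute.
set.seed(1)
X <- scale(matrix(rnorm(200), ncol = 2), scale = FALSE)
evd <- eigen(cov(X), symmetric = TRUE)
U <- X %*% evd$vectors %*% diag(1 / sqrt(evd$values)) %*% t(evd$vectors)
U <- check_whitened_sketch(U, check.attribute.only = FALSE)  # slow, sets attribute
check_whitened_sketch(U)                                     # fast attribute lookup
```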

sqrt_matrix returns an n \times n matrix. If \mathbf{A} is not positive semi-definite it returns a complex-valued \mathbf{B} (since the square roots of negative eigenvalues are complex).
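Why an indefinite matrix leads to a complex-valued square root can be seen in a small base R sketch. The matrix below is an arbitrary example with one negative eigenvalue, and the explicit `as.complex()` conversion is an assumption of this sketch, not necessarily how sqrt_matrix handles the case internally.

```r
# Symmetric but indefinite matrix: eigenvalues are 3 and -1.
A <- matrix(c(1, 2, 2, 1), ncol = 2)
evd <- eigen(A, symmetric = TRUE)

# sqrt() of a negative double gives NaN, so convert to complex first.
lambda_sqrt <- sqrt(as.complex(evd$values))
B <- evd$vectors %*% diag(lambda_sqrt) %*% t(evd$vectors)

is.complex(B)            # the square root has non-zero imaginary parts
max(Mod(B %*% B - A))    # B %*% B still recovers A up to rounding error
```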

If return.sqrt.only = FALSE, sqrt_matrix returns a list with:

`values`	eigenvalues of `\mathbf{A}`,
`vectors`	eigenvectors of `\mathbf{A}`,
`sqrt`	square root matrix `\mathbf{B}`,
`sqrt.inverse`	inverse of `\mathbf{B}`.

References

See appendix in http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.

See http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening.

Examples


## Not run: 
XX <- matrix(rnorm(100), ncol = 2) %*% matrix(runif(4), ncol = 2)
cov(XX)
UU <- whiten(XX)$U
cov(UU)

## End(Not run)

[Package ForeCA version 0.2.7 Index]