trafos_dimreduction {gnn}		R Documentation

Dimension-Reduction Transformations for Training or Sampling

Description

Dimension-reduction transformations applied to an input data matrix. Currently, only the principal component transformation and its inverse are provided.

Usage

PCA_trafo(x, mu, Gamma, inverse = FALSE, ...)

Arguments

x

(n, d)-matrix of data (typically before training or after sampling). If inverse = TRUE, conceptually an (n, k)-matrix with 1 <= k <= d, where d is the dimension of the original data whose dimension was reduced to k.

mu

if inverse = TRUE, a d-vector of centers, where d is the dimension to transform x to.

Gamma

if inverse = TRUE, a (d, k)-matrix, with k at least as large as ncol(x), containing the k orthonormal eigenvectors of a covariance matrix sorted in decreasing order of their eigenvalues; in other words, the columns of Gamma contain the principal axes (or loadings). If a matrix with more than ncol(x) columns is provided, only the first ncol(x) columns are used.

inverse

logical indicating whether the inverse of the principal component transformation is applied.

...

additional arguments passed to the underlying prcomp().

Details

Conceptually, the principal component transformation maps a d-dimensional vector X to the vector Y = Gamma^T (X - mu), where mu is the mean vector of X and Gamma is the (d, d)-matrix whose columns contain the orthonormal eigenvectors of cov(X), sorted in decreasing order of the corresponding eigenvalues.

The corresponding (conceptual) inverse transformation is X = mu + Gamma Y.

See McNeil et al. (2015, Section 6.4.5).
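The two conceptual maps above can be sketched in a few lines of base R (an illustration only, not using gnn), building Gamma and mu from the eigendecomposition of the sample covariance matrix and checking that the inverse map recovers the data:

```r
## Sketch of Y = Gamma^T (X - mu) and X = mu + Gamma Y (base R, not gnn code)
set.seed(1)
X <- matrix(rnorm(200 * 5), ncol = 5) # (n, d)-matrix of data
mu <- colMeans(X) # d-vector of centers
ev <- eigen(cov(X), symmetric = TRUE) # eigenvalues sorted in decreasing order
Gamma <- ev$vectors # (d, d)-matrix of principal axes
Y <- sweep(X, 2, mu) %*% Gamma # rows are Gamma^T (X - mu)
X. <- sweep(Y %*% t(Gamma), 2, mu, FUN = "+") # inverse: mu + Gamma Y
stopifnot(all.equal(X., X)) # the inverse transformation recovers X
```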

Value

If inverse = TRUE, the transformed data whose rows contain X = mu + Gamma Y, where Y is one row of x. See the Details section for the notation.

If inverse = FALSE, a list containing:

PCs:

(n, d)-matrix of principal components.

cumvar:

cumulative variances; the jth entry gives the fraction of the variance explained by the first j principal components.

sd:

sample standard deviations of the transformed data.

lambda:

eigenvalues of cov(x).

mu:

d-vector of centers of x (see also above), typically provided to PCA_trafo(, inverse = TRUE).

Gamma:

(d, d)-matrix of principal axes (see also above), typically provided to PCA_trafo(, inverse = TRUE).
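The returned components are related in the usual way: the variances of the principal components equal the eigenvalues of cov(x), and the cumulative variances are their normalized cumulative sums. A small base-R sketch (an illustration, not gnn code) of these relationships:

```r
## Sketch (base R, not gnn code): how lambda, cumvar and sd are related
set.seed(1)
X <- matrix(rnorm(100 * 4), ncol = 4)
ev <- eigen(cov(X), symmetric = TRUE)
lambda <- ev$values # eigenvalues of cov(X), sorted decreasingly
cumvar <- cumsum(lambda) / sum(lambda) # fraction of variance explained
Y <- sweep(X, 2, colMeans(X)) %*% ev$vectors # principal components
sd. <- apply(Y, 2, sd) # sample standard deviations of the PCs
stopifnot(all.equal(sd.^2, lambda)) # PC variances equal the eigenvalues
stopifnot(all.equal(cumvar[length(cumvar)], 1)) # all PCs explain everything
```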

Author(s)

Marius Hofert

References

McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.

Examples

library(gnn) # so that the example is standalone

## Generate data
library(copula)
set.seed(271)
X <- qt(rCopula(1000, gumbelCopula(2, dim = 10)), df = 3.5)
pairs(X, gap = 0, pch = ".")

## Principal component transformation
PCA <- PCA_trafo(X)
Y <- PCA$PCs
PCA$cumvar[3] # fraction of variance explained by the first 3 principal components
which.max(PCA$cumvar > 0.9) # number of principal components it takes to explain 90%

## Plot of the first two principal components (= data transformed with
## the first two principal axes)
plot(Y[,1:2])

## Transform back and compare
X. <- PCA_trafo(Y, mu = PCA$mu, Gamma = PCA$Gamma, inverse = TRUE)
stopifnot(all.equal(X., X))

## Note: One typically transforms back with only some of the principal axes
X. <- PCA_trafo(Y[,1:3], mu = PCA$mu, # mu determines the dimension to transform to
                Gamma = PCA$Gamma, # must be of dim. (length(mu), k) for k >= ncol(x)
                inverse = TRUE)
stopifnot(dim(X.) == c(1000, 10))
## Note: We (typically) transform back to the original dimension.
pairs(X., gap = 0, pch = ".") # pairs of back-transformed first three PCs

[Package gnn version 0.0-4 Index]