trafos_dimreduction {gnn}		R Documentation

Dimension-Reduction Transformations for Training or Sampling

Description

Dimension-reduction transformations applied to an input data matrix. Currently, only the principal component transformation and its inverse are provided.

Usage

PCA_trafo(x, mu, Gamma, inverse = FALSE, ...)

Arguments

x

(n, d)-matrix of data (typically before training or after sampling). If inverse = TRUE, conceptually an (n, k)-matrix with 1 <= k <= d, where d is the dimension of the original data whose dimension was reduced to k.

mu

if inverse = TRUE, a d-vector of centers, where d is the dimension to transform x to.

Gamma

if inverse = TRUE, a (d, k)-matrix, with k at least as large as ncol(x), containing the k orthonormal eigenvectors of a covariance matrix sorted in decreasing order of their eigenvalues; in other words, the columns of Gamma contain the principal axes (or loadings). If a matrix with more than ncol(x) columns is provided, only the first ncol(x) columns are used.

inverse

logical indicating whether the inverse of the principal component transformation is applied.

...

additional arguments passed to the underlying prcomp().

Details

Conceptually, the principal component transformation maps a d-dimensional vector X to the vector Y = Gamma^T (X - mu), where mu is the mean vector of X and Gamma is the (d, d)-matrix whose columns contain the orthonormal eigenvectors of cov(X), sorted in decreasing order of the corresponding eigenvalues.

The corresponding (conceptual) inverse transformation is X = mu + Gamma Y.

See McNeil et al. (2015, Section 6.4.5).
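The two conceptual maps above can be sketched in a few lines of base R (an illustration only, not using gnn), building Gamma and mu from the eigendecomposition of the sample covariance matrix and checking that the inverse map recovers the data:

```r
## Sketch of Y = Gamma^T (X - mu) and X = mu + Gamma Y (base R, not gnn code)
set.seed(1)
X <- matrix(rnorm(200 * 5), ncol = 5) # (n, d)-matrix of data
mu <- colMeans(X) # d-vector of centers
ev <- eigen(cov(X), symmetric = TRUE) # eigenvalues sorted in decreasing order
Gamma <- ev$vectors # (d, d)-matrix of principal axes
Y <- sweep(X, 2, mu) %*% Gamma # rows are Gamma^T (X - mu)
X. <- sweep(Y %*% t(Gamma), 2, mu, FUN = "+") # inverse: mu + Gamma Y
stopifnot(all.equal(X., X)) # the inverse transformation recovers X
```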

Value

If inverse = TRUE, the transformed data whose rows contain X = mu + Gamma Y, where Y is one row of x. See the Details section for the notation.

If inverse = FALSE, a list containing:

PCs:

(n, d)-matrix of principal components.

cumvar:

cumulative variances; the jth entry gives the fraction of the variance explained by the first j principal components.

sd:

sample standard deviations of the transformed data.

lambda:

eigenvalues of cov(x).

mu:

d-vector of centers of x (see also above), typically provided to PCA_trafo(, inverse = TRUE).

Gamma:

(d, d)-matrix of principal axes (see also above), typically provided to PCA_trafo(, inverse = TRUE).
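The returned components are related in the usual way: the variances of the principal components equal the eigenvalues of cov(x), and the cumulative variances are their normalized cumulative sums. A small base-R sketch (an illustration, not gnn code) of these relationships:

```r
## Sketch (base R, not gnn code): how lambda, cumvar and sd are related
set.seed(1)
X <- matrix(rnorm(100 * 4), ncol = 4)
ev <- eigen(cov(X), symmetric = TRUE)
lambda <- ev$values # eigenvalues of cov(X), sorted decreasingly
cumvar <- cumsum(lambda) / sum(lambda) # fraction of variance explained
Y <- sweep(X, 2, colMeans(X)) %*% ev$vectors # principal components
sd. <- apply(Y, 2, sd) # sample standard deviations of the PCs
stopifnot(all.equal(sd.^2, lambda)) # PC variances equal the eigenvalues
stopifnot(all.equal(cumvar[length(cumvar)], 1)) # all PCs explain everything
```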

Author(s)

Marius Hofert

References

McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.

Examples

library(gnn) # so that the example is standalone

## Generate data
library(copula)
set.seed(271)
X <- qt(rCopula(1000, gumbelCopula(2, dim = 10)), df = 3.5)
pairs(X, gap = 0, pch = ".")

## Principal component transformation
PCA <- PCA_trafo(X)
Y <- PCA$PCs
PCA$cumvar[3] # fraction of variance explained by the first 3 principal components
which.max(PCA$cumvar > 0.9) # number of principal components it takes to explain 90%

## Plot of the first two principal components (= data transformed with
## the first two principal axes)
plot(Y[,1:2])

## Transform back and compare
X. <- PCA_trafo(Y, mu = PCA$mu, Gamma = PCA$Gamma, inverse = TRUE)
stopifnot(all.equal(X., X))

## Note: One typically transforms back with only some of the principal axes
X. <- PCA_trafo(Y[,1:3], mu = PCA$mu, # mu determines the dimension to transform to
                Gamma = PCA$Gamma, # must be of dim. (length(mu), k) for k >= ncol(x)
                inverse = TRUE)
stopifnot(dim(X.) == c(1000, 10))
## Note: We (typically) transform back to the original dimension.
pairs(X., gap = 0, pch = ".") # pairs of back-transformed first three PCs

[Package gnn version 0.0-4 Index]