R: Kernel Principal Components Analysis

kernelpca {kpcaIG}

R Documentation

Kernel Principal Components Analysis

Description

Kernel Principal Components Analysis, a nonlinear version of principal component analysis obtained through the so-called kernel trick.

Usage

kernelpca(data, kernel = "vanilladot", kpar = list(), features = 0)

Arguments

`data`	The data matrix organized by rows. Users should scale the data appropriately before applying this function, if relevant.
`kernel`	The kernel function used for the analysis, given as one of the following strings: `"rbfdot"` (radial basis, or "Gaussian", kernel function); `"polydot"` (polynomial kernel function); `"vanilladot"` (linear kernel function, the default); `"tanhdot"` (hyperbolic tangent kernel function).
`kpar`	A list of hyper-parameters (kernel parameters) passed to the kernel function. The valid parameters depend on the kernel: `sigma` (inverse kernel width) for the radial basis kernel function `"rbfdot"`; `degree`, `scale` and `offset` for the polynomial kernel function `"polydot"`; `scale` and `offset` for the hyperbolic tangent kernel function `"tanhdot"`.
`features`	The number of features (kernel principal components) to return. Default: 0 (all components).
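As a rough illustration of how the `kpar` entries enter each kernel, the sketch below writes the four kernels as plain R functions. It assumes that kpcaIG follows the kernel definitions used in kernlab; the helper names (`rbf_kernel`, `poly_kernel`, `tanh_kernel`, `linear_kernel`) and their default values are hypothetical and are not part of either package.

```r
# Illustrative kernel definitions (assumed to match kernlab's conventions).
# These helpers are hypothetical and only show where each parameter acts.
rbf_kernel <- function(x, y, sigma = 1) {
  exp(-sigma * sum((x - y)^2))            # sigma: inverse kernel width
}
poly_kernel <- function(x, y, degree = 1, scale = 1, offset = 1) {
  (scale * sum(x * y) + offset)^degree    # scaled inner product, shifted, raised to a power
}
tanh_kernel <- function(x, y, scale = 1, offset = 1) {
  tanh(scale * sum(x * y) + offset)       # scaled inner product, shifted, squashed by tanh
}
linear_kernel <- function(x, y) {
  sum(x * y)                              # plain inner product, no parameters
}

x <- c(1, 2)
y <- c(3, 4)
rbf_kernel(x, y, sigma = 0.5)                          # exp(-0.5 * 8) = exp(-4)
poly_kernel(x, y, degree = 2, scale = 1, offset = 1)   # (11 + 1)^2 = 144
tanh_kernel(x, y, scale = 0.1, offset = 0)             # tanh(1.1)
linear_kernel(x, y)                                    # 11
```

Under these definitions, small `scale` values keep the hyperbolic tangent kernel away from its saturated region when the inner products are large, as is typical for high-dimensional data.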

Value

kernelpca returns an S4 object of formal class `kpca`, as defined in the kernlab package, containing the principal component vectors along with the corresponding eigenvalues. The object has the following slots:

`pcv`	A matrix containing the principal component vectors (column-wise)
`eig`	The corresponding eigenvalues
`rotated`	The original data projected (rotated) on the principal components
`xmatrix`	The original data matrix
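To make the relationship between these slots more concrete, here is a minimal base R sketch of kernel PCA computed from a precomputed kernel matrix. It is not the kpcaIG or kernlab source code: the function `kpca_sketch`, its `tol` argument, and the choice to scale the centered kernel matrix by the number of observations are assumptions made for illustration.

```r
# Illustrative kernel PCA from a kernel matrix K (hypothetical helper).
kpca_sketch <- function(K, features = 0, tol = 1e-8) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)      # centering matrix
  Kc <- H %*% K %*% H                     # kernel matrix centered in feature space
  e <- eigen(Kc / n, symmetric = TRUE)    # eigen-decomposition (assumed scaling by n)
  keep <- which(e$values > tol)           # drop numerically null directions
  if (features > 0) {
    keep <- keep[seq_len(min(features, length(keep)))]
  }
  # Rescale each eigenvector by the square root of its eigenvalue
  pcv <- sweep(e$vectors[, keep, drop = FALSE], 2, sqrt(e$values[keep]), "/")
  list(pcv = pcv,                         # principal component vectors (columns)
       eig = e$values[keep],              # corresponding eigenvalues
       rotated = Kc %*% pcv)              # data projected on the components
}

set.seed(1)
X <- scale(matrix(rnorm(60), nrow = 20, ncol = 3))
K_lin <- X %*% t(X)                       # linear kernel matrix
fit <- kpca_sketch(K_lin, features = 2)
dim(fit$rotated)                          # 20 rows, 2 columns
```

With a linear kernel, the projected data in this sketch agree with ordinary PCA scores up to sign and a constant scaling factor, which is a convenient sanity check.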

References

Schölkopf B., Smola A. and Müller K.-R. (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10, 1299-1319.

Examples

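# Load the example data and the required packages
library(WallomicsData)
library(kpcaIG)
library(ggplot2)
library(kernlab)

# Center and scale the stem transcriptomics data before the analysis
Transcriptomics_Stems_s <- scale(Transcriptomics_Stems)

# Kernel PCA with a hyperbolic tangent kernel
kpca_tan <- kernelpca(as.matrix(Transcriptomics_Stems_s),
                      kernel = "tanhdot",
                      kpar = list(scale = 0.0001, offset = 0.01))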


# First two kernel PCs, with points colored and shaped by genetic cluster
ggplot(data = data.frame(rotated(kpca_tan), Genetic_Cluster),
       aes(x = X1, y = X2, shape = Genetic_Cluster)) +
  geom_point(size = 2, aes(color = Genetic_Cluster)) +
  xlab("1st kernel PC") +
  ylab("2nd kernel PC") +
  labs(color = "Genetic_Cluster", shape = "Genetic_Cluster") +
  theme_minimal()


# First two kernel PCs, with points colored and shaped by ecotype
ggplot(data = data.frame(rotated(kpca_tan), Ecotype),
       aes(x = X1, y = X2, shape = Ecotype)) +
  geom_point(size = 2, aes(color = Ecotype)) +
  xlab("1st kernel PC") +
  ylab("2nd kernel PC") +
  labs(color = "Ecotype", shape = "Ecotype") +
  theme_minimal()

[Package kpcaIG version 1.0]