kernelpca {kpcaIG}    R Documentation

Kernel Principal Components Analysis

Description

Kernel Principal Components Analysis, a nonlinear version of principal component analysis obtained through the so-called kernel trick.
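
As a sketch of what the kernel trick amounts to here, following the formulation of the 1998 paper listed in the References (the symbols below are notation introduced for this illustration, not part of the package): for n observations x_1, ..., x_n and a kernel k, the kernel matrix is centered in feature space and diagonalized, so the components are obtained from kernel evaluations only.

```latex
K_{ij} = k(x_i, x_j), \qquad
\tilde{K} = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n, \qquad
(\mathbf{1}_n)_{ij} = \tfrac{1}{n}
\\
\tilde{K}\,\alpha^{(m)} = n \lambda_m\, \alpha^{(m)}, \qquad
z_m(x) = \sum_{i=1}^{n} \alpha^{(m)}_i \, \tilde{k}(x_i, x)
```

Here \tilde{k} denotes the centered kernel, z_m(x) is the m-th kernel principal component score of x, and each \alpha^{(m)} is rescaled so that the corresponding feature-space direction has unit norm.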

Usage

kernelpca(data, kernel = "vanilladot", kpar = list(), features = 0)

Arguments

data

The data matrix, with one observation per row. Scale the data appropriately before calling this function, if relevant.

kernel

The kernel function used for the analysis. It can be chosen from the following strings:

  • 'rbfdot': Radial basis (Gaussian) kernel function

  • 'polydot': Polynomial kernel function

  • 'vanilladot': Linear kernel function

  • 'tanhdot': Hyperbolic tangent kernel function

kpar

The list of hyper-parameters (kernel parameters) used with the kernel function. The valid parameters for each kernel type are as follows:

  • sigma: inverse kernel width for the Radial Basis kernel function "rbfdot".

  • degree, scale, offset for the Polynomial kernel function "polydot".

  • scale, offset for the Hyperbolic tangent kernel function "tanhdot".
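
The following sketch illustrates how these hyper-parameters enter each kernel. It assumes that the kernel strings above refer to the kernlab kernel generators of the same names (an assumption suggested by the kpca return class, not stated on this page), and it calls kernlab directly rather than kernelpca. The vectors x and y are made up for the illustration.

```r
library(kernlab)

# Two toy vectors: <x, y> = 5 and ||x - y||^2 = 9
x <- c(1, 2, 3)
y <- c(2, 0, 1)

# Kernel objects built with the same parameter names as accepted in kpar
k_rbf  <- rbfdot(sigma = 0.1)
k_poly <- polydot(degree = 2, scale = 1, offset = 1)
k_lin  <- vanilladot()
k_tanh <- tanhdot(scale = 0.01, offset = 0.1)

# Kernel values next to the closed-form expressions they implement
ip <- sum(x * y)          # inner product <x, y>
d2 <- sum((x - y)^2)      # squared Euclidean distance

vals <- c(
  rbfdot     = as.numeric(k_rbf(x, y)),   # exp(-sigma * ||x - y||^2)
  polydot    = as.numeric(k_poly(x, y)),  # (scale * <x, y> + offset)^degree
  vanilladot = as.numeric(k_lin(x, y)),   # <x, y>
  tanhdot    = as.numeric(k_tanh(x, y))   # tanh(scale * <x, y> + offset)
)
manual <- c(
  rbfdot     = exp(-0.1 * d2),
  polydot    = (1 * ip + 1)^2,
  vanilladot = ip,
  tanhdot    = tanh(0.01 * ip + 0.1)
)
print(cbind(kernlab = vals, manual = manual))
```

Because sigma is an inverse width, larger values make the Gaussian kernel decay faster with distance.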

features

The number of features (kernel principal components) to use for the analysis. Default: 0 (use all components).
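
A minimal sketch of the features argument, assuming kpcaIG and kernlab are installed. The built-in iris data and the value sigma = 0.1 are chosen purely for illustration, and the sketch assumes that, as with kernlab's kpca, a positive features value truncates the result to that many components.

```r
library(kpcaIG)
library(kernlab)  # provides the rotated() accessor

# Illustrative data: the four numeric iris measurements, centered and scaled
X <- scale(as.matrix(iris[, 1:4]))

# features = 0 (the default) keeps all components
kpca_all <- kernelpca(X, kernel = "rbfdot", kpar = list(sigma = 0.1))

# features = 2 asks for the first two kernel principal components only
kpca_two <- kernelpca(X, kernel = "rbfdot", kpar = list(sigma = 0.1),
                      features = 2)

# Compare the number of columns of the projected data
ncol(rotated(kpca_all))
ncol(rotated(kpca_two))
```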

Value

kernelpca returns an S4 object of formal class kpca, as defined in the kernlab package, containing the principal component vectors along with the corresponding eigenvalues.

pcv

A matrix containing the principal component vectors (column-wise)

eig

The corresponding eigenvalues

rotated

The original data projected (rotated) on the principal components

xmatrix

The original data matrix
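
The components above can be read with the kernlab accessor functions of the same names. The sketch below assumes kpcaIG and kernlab are installed and uses the built-in iris data with a linear kernel purely for illustration. The final ratio is a share among the retained eigenvalues only, not a claim about total variance.

```r
library(kpcaIG)
library(kernlab)  # provides pcv(), eig(), rotated() and xmatrix()

# Illustrative data: the four numeric iris measurements, centered and scaled
X <- scale(as.matrix(iris[, 1:4]))

# Linear kernel, keeping three kernel principal components
fit <- kernelpca(X, kernel = "vanilladot", features = 3)

dim(pcv(fit))        # principal component vectors, one per column
eig(fit)             # eigenvalues of the retained components
head(rotated(fit))   # data projected on the kernel principal components
dim(xmatrix(fit))    # original data matrix stored in the object

# Share of each retained eigenvalue among the retained ones
eig(fit) / sum(eig(fit))
```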

References

Schölkopf B., Smola A. and Müller K.-R. (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10, 1299-1319.

Examples

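# Kernel PCA on the stem transcriptomics data from WallomicsData
library(WallomicsData)
library(kpcaIG)
library(ggplot2)
library(kernlab)  # provides the rotated() accessor

# Center and scale each variable before the analysis
Transcriptomics_Stems_s <- scale(Transcriptomics_Stems)

# Kernel PCA with a hyperbolic tangent kernel
kpca_tan <- kernelpca(as.matrix(Transcriptomics_Stems_s),
                      kernel = "tanhdot",
                      kpar = list(scale = 0.0001, offset = 0.01))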


# First two kernel PCs, colored and shaped by genetic cluster
ggplot(data = data.frame(rotated(kpca_tan), Genetic_Cluster),
       aes(x = X1, y = X2, shape = Genetic_Cluster)) +
  geom_point(size = 2, aes(color = Genetic_Cluster)) +
  xlab("1st kernel PC") +
  ylab("2nd kernel PC") +
  labs(color = "Genetic_Cluster", shape = "Genetic_Cluster") +
  theme_minimal()


# Same projection, colored and shaped by ecotype
ggplot(data = data.frame(rotated(kpca_tan), Ecotype),
       aes(x = X1, y = X2, shape = Ecotype)) +
  geom_point(size = 2, aes(color = Ecotype)) +
  xlab("1st kernel PC") +
  ylab("2nd kernel PC") +
  labs(color = "Ecotype", shape = "Ecotype") +
  theme_minimal()



[Package kpcaIG version 1.0 Index]