
diagram_kpca {TDApplied}

R Documentation

Calculate the kernel PCA embedding of a group of persistence diagrams.

Description

Project a group of persistence diagrams into a low-dimensional embedding space using kernel PCA, a kernelized version of principal component analysis (PCA).

Usage

diagram_kpca(
  diagrams,
  K = NULL,
  dim = 0,
  t = 1,
  sigma = 1,
  rho = NULL,
  features = 1,
  num_workers = parallelly::availableCores(omit = 1),
  th = 1e-04
)

Arguments

`diagrams`	a list of persistence diagrams, each of which is either the output of a persistent homology calculation such as `ripsDiag`/`calculate_homology`/`PyH`, or the output of `diagram_to_df`.
`K`	an optional precomputed Gram matrix of the persistence diagrams in 'diagrams', default NULL.
`dim`	the non-negative integer homological dimension in which the Gram matrix is to be computed, default 0.
`t`	a positive number representing the scale for the persistence Fisher kernel, default 1.
`sigma`	a positive number representing the bandwidth for the Fisher information metric, default 1.
`rho`	an optional positive number representing the heuristic for Fisher information metric approximation, see `diagram_distance`. Default NULL. If supplied, Gram matrix calculation is sequential.
`features`	number of features (principal components) to return, default 1.
`num_workers`	the number of cores used for parallel computation, default is one less than the number of cores on the machine.
`th`	the threshold value under which principal components are ignored, default 1e-04.
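
As a rough illustration of the role of `t`, the base-R sketch below assumes the persistence Fisher kernel has the form exp(-t * d), where d is the Fisher information metric between two diagrams. The distance matrix `D` and the helper `gram_from_dist` are hypothetical stand-ins and are not part of TDApplied; in practice `gram_matrix` computes the Gram matrix directly from the diagrams.

```r
# Hypothetical pairwise distances between three diagrams
# (stand-ins for Fisher information metric values).
D <- as.matrix(dist(matrix(c(0, 1, 3), ncol = 1)))

# Assumed kernel form: k = exp(-t * d). Not a TDApplied function.
gram_from_dist <- function(D, t = 1) exp(-t * D)

K1 <- gram_from_dist(D, t = 1)
K2 <- gram_from_dist(D, t = 2)

# A larger t shrinks the off-diagonal similarities; the diagonal stays at 1.
print(round(K1, 3))
print(round(K2, 3))
```

Under this assumption, increasing `t` makes diagrams look less similar to one another, so the choice of `t` affects how much structure the embedding can separate.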

Details

Returns the output of kernlab's kpca function applied to the Gram matrix of a group of persistence diagrams in a given homological dimension. The prediction function predict_diagram_kpca can project new persistence diagrams using an existing embedding, which is one practical advantage of diagram_kpca over diagram_mds. The embedding coordinates can also be used for further analysis, or simply to visualize a group of persistence diagrams.
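
To make the idea of reusing an old embedding concrete, here is a minimal base-R sketch of kernel PCA on a precomputed Gram matrix, followed by the projection of new observations. It is not kernlab's or TDApplied's implementation: a Gaussian kernel on random points stands in for the Gram matrix of persistence diagrams, the helpers `gauss` and `center_cross` are hypothetical, and the eigenvector scaling shown is one common convention.

```r
# Toy stand-in data: random 2D points instead of persistence diagrams.
set.seed(1)
X_old <- matrix(rnorm(12), ncol = 2)  # six "old" observations
X_new <- matrix(rnorm(4), ncol = 2)   # two "new" observations

# Gaussian kernel between the rows of A and the rows of B (hypothetical helper).
gauss <- function(A, B) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * A %*% t(B)
  exp(-d2)
}
K <- gauss(X_old, X_old)      # 6 x 6 Gram matrix of the old observations
K_new <- gauss(X_new, X_old)  # 2 x 6 cross-Gram matrix, new versus old

# Center a (cross-)Gram matrix in feature space using the old Gram matrix.
center_cross <- function(K_cross, K) {
  m <- nrow(K)
  ones_l <- matrix(1 / m, nrow(K_cross), m)
  ones_r <- matrix(1 / m, m, m)
  K_cross - ones_l %*% K - K_cross %*% ones_r + ones_l %*% K %*% ones_r
}

# Kernel PCA: eigendecomposition of the centered Gram matrix.
features <- 2
m <- nrow(K)
Kc <- center_cross(K, K)
e <- eigen(Kc / m, symmetric = TRUE)
keep <- seq_len(features)
# Scale the eigenvectors by the square roots of their eigenvalues.
pcv <- sweep(e$vectors[, keep, drop = FALSE], 2, sqrt(e$values[keep]), "/")
rotated <- Kc %*% pcv  # embedding of the old observations

# Project new observations without recomputing the embedding.
projected <- center_cross(K_new, K) %*% pcv
print(projected)
```

The key point is that the new observations only need their kernel values against the old ones; the principal component vectors are reused unchanged, which is something a classical MDS embedding does not offer directly.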

Value

a list of class 'diagram_kpca' containing the elements

pca: the output of kernlab's kpca function on the Gram matrix: an S4 object containing the slots 'pcv' (a matrix containing the principal component vectors (column wise)), 'eig' (the corresponding eigenvalues), 'rotated' (the original data projected (rotated) on the principal components) and 'xmatrix' (the original data matrix).
diagrams: the input 'diagrams' argument.
t: the input 't' argument.
sigma: the input 'sigma' argument.
dim: the input 'dim' argument.

Author(s)

Shael Brown - shaelebrown@gmail.com

References

Scholkopf, B and Smola, A and Muller, K (1998). "Nonlinear Component Analysis as a Kernel Eigenvalue Problem." https://www.mlpack.org/papers/kpca.pdf.

Examples


if(require("TDAstats"))
{
  # create six diagrams
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D3 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D4 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D5 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D6 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  g <- list(D1,D2,D3,D4,D5,D6)

  # calculate their 2D kernel PCA embedding with sigma = t = 2 in dimension 1
  pca <- diagram_kpca(diagrams = g,dim = 1,t = 2,sigma = 2,features = 2,num_workers = 2,th = 1e-6)
  
  # repeat with precomputed Gram matrix, gives same result but much faster
  K <- gram_matrix(diagrams = g,dim = 1,t = 2,sigma = 2,num_workers = 2)
  pca <- diagram_kpca(diagrams = g,K = K,dim = 1,t = 2,sigma = 2,features = 2,th = 1e-6)
  
}

[Package TDApplied version 3.0.3 Index]