diagram_kpca {TDApplied}R Documentation

Calculate the kernel PCA embedding of a group of persistence diagrams.

Description

Project a group of persistence diagrams into a low-dimensional embedding space using a kernelized version of the popular PCA algorithm.

Usage

diagram_kpca(
  diagrams,
  K = NULL,
  dim = 0,
  t = 1,
  sigma = 1,
  rho = NULL,
  features = 1,
  num_workers = parallelly::availableCores(omit = 1),
  th = 1e-04
)

Arguments

diagrams

a list of persistence diagrams which are either the output of a persistent homology calculation like ripsDiag/calculate_homology/PyH, or diagram_to_df.

K

an optional precomputed Gram matrix of the persistence diagrams in 'diagrams', default NULL.

dim

the non-negative integer homological dimension in which the distance is to be computed, default 0.

t

a positive number representing the scale for the persistence Fisher kernel, default 1.

sigma

a positive number representing the bandwidth for the Fisher information metric, default 1.

rho

an optional positive number representing the heuristic for Fisher information metric approximation, see diagram_distance. Default NULL. If supplied, Gram matrix calculation is sequential.

features

number of features (principal components) to return, default 1.

num_workers

the number of cores used for parallel computation, default is one less than the number of cores on the machine.

th

the threshold value under which principal components are ignored (default 0.0001).

Details

Returns the output of kernlab's kpca function on the desired Gram matrix of a group of persistence diagrams in a particular dimension. The prediction function predict_diagram_kpca can be used to project new persistence diagrams using an old embedding, and this could be one practical advantage of using diagram_kpca over diagram_mds. The embedding coordinates can also be used for further analysis, or simply as a data visualization tool for persistence diagrams.

Value

a list of class 'diagram_kpca' containing the elements

pca

the output of kernlab's kpca function on the Gram matrix: an S4 object containing the slots 'pcv' (a matrix containing the principal component vectors (column wise)), 'eig' (the corresponding eigenvalues), 'rotated' (the original data projected (rotated) on the principal components) and 'xmatrix' (the original data matrix).

diagrams

the input 'diagrams' argument.

t

the input 't' argument.

sigma

the input 'sigma' argument.

dim

the input 'dim' argument.

Author(s)

Shael Brown - shaelebrown@gmail.com

References

Scholkopf, B and Smola, A and Muller, K (1998). "Nonlinear Component Analysis as a Kernel Eigenvalue Problem." https://www.mlpack.org/papers/kpca.pdf.

See Also

predict_diagram_kpca for predicting embedding coordinates of new diagrams.

Examples


if(require("TDAstats"))
{
  # create six diagrams
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D3 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D4 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D5 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  D6 <- TDAstats::calculate_homology(TDAstats::sphere3d[sample(1:100,20),],
                      dim = 1,threshold = 2)
  g <- list(D1,D2,D3,D4,D5,D6)

  # calculate their 2D PCA embedding with sigma = t = 2 in dimension 1
  pca <- diagram_kpca(diagrams = g,dim = 1,t = 2,sigma = 2,features = 2,num_workers = 2,th = 1e-6)
  
  # repeat with precomputed Gram matrix, gives same result but much faster
  K <- gram_matrix(diagrams = g,dim = 1,t = 2,sigma = 2,num_workers = 2)
  pca <- diagram_kpca(diagrams = g,K = K,dim = 1,t = 2,sigma = 2,features = 2,th = 1e-6)
  
}

[Package TDApplied version 3.0.3 Index]