diagram_distance {TDApplied}R Documentation

Calculate distance between a pair of persistence diagrams.

Description

Calculates the distance between a pair of persistence diagrams, either the output from a diagram_to_df function call or from a persistent homology calculation like ripsDiag/calculate_homology/PyH, in a particular homological dimension.

Usage

diagram_distance(
  D1,
  D2,
  dim = 0,
  p = 2,
  distance = "wasserstein",
  sigma = NULL,
  rho = NULL
)

Arguments

D1

the first persistence diagram.

D2

the second persistence diagram.

dim

the non-negative integer homological dimension in which the distance is to be computed, default 0.

p

a number representing the wasserstein power parameter, at least 1 and default 2.

distance

a string which determines which type of distance calculation to carry out, either "wasserstein" (default) or "fisher".

sigma

either NULL (default) or a positive number representing the bandwidth for the Fisher information metric.

rho

either NULL (default) or a positive number. If NULL then the exact calculation of the Fisher information metric is returned and otherwise a fast approximation, see details.

Details

The most common distance calculations between persistence diagrams are the wasserstein and bottleneck distances, both of which "match" points between their two input diagrams and compute the "loss" of the optimal matching (see https://dl.acm.org/doi/10.1145/3064175 for details). Another method for computing distances, the Fisher information metric, converts the two diagrams into distributions defined on the plane, and calculates a distance between the resulting two distributions (https://proceedings.neurips.cc/paper/2018/file/959ab9a0695c467e7caf75431a872e5c-Paper.pdf). If the 'distance' parameter is "fisher" then 'sigma' must not be NULL. As noted in the Persistence Fisher paper, there is a fast speed-up approximation which has been implemented from https://github.com/vmorariu/figtree and can be accessed by setting the 'rho' parameter. Smaller values of 'rho' will result in tighter approximations at the expense of longer runtime, and vice versa.

Value

the numeric value of the distance calculation.

Author(s)

Shael Brown - shaelebrown@gmail.com

References

Kerber M, Morozov D and Nigmetov A (2017). "Geometry Helps to Compare Persistence Diagrams." https://dl.acm.org/doi/10.1145/3064175.

Le T, Yamada M (2018). "Persistence fisher kernel: a riemannian manifold kernel for persistence diagrams." https://proceedings.neurips.cc/paper/2018/file/959ab9a0695c467e7caf75431a872e5c-Paper.pdf.

Vlad I. Morariu, Balaji Vasan Srinivasan, Vikas C. Raykar, Ramani Duraiswami, and Larry S. Davis. Automatic online tuning for fast Gaussian summation. Advances in Neural Information Processing Systems (NIPS), 2008.

See Also

distance_matrix for distance matrix calculations.

Examples


if(require("TDAstats"))
{
  # create two diagrams
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,size = 20),],
                      dim = 1,threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100,size = 20),],
                      dim = 1,threshold = 2)

  # calculate 2-wasserstein distance between D1 and D2 in dimension 1
  diagram_distance(D1,D2,dim = 1,p = 2,distance = "wasserstein")

  # calculate bottleneck distance between D1 and D2 in dimension 0
  diagram_distance(D1,D2,dim = 0,p = Inf,distance = "wasserstein")

  # Fisher information metric calculation between D1 and D2 for sigma = 1 in dimension 1
  diagram_distance(D1,D2,dim = 1,distance = "fisher",sigma = 1)
  
  # repeat but with fast approximation
  ## Not run: 
  diagram_distance(D1,D2,dim = 1,distance = "fisher",sigma = 1,rho = 0.001)
  
## End(Not run)
}

[Package TDApplied version 3.0.3 Index]