diagram_distance {TDApplied} | R Documentation |
Calculate distance between a pair of persistence diagrams.
Description
Calculates the distance between a pair of persistence diagrams, either the output from a diagram_to_df function call or from a persistent homology calculation like ripsDiag/calculate_homology/PyH, in a particular homological dimension.
Usage
diagram_distance(
D1,
D2,
dim = 0,
p = 2,
distance = "wasserstein",
sigma = NULL,
rho = NULL
)
Arguments
D1 |
the first persistence diagram. |
D2 |
the second persistence diagram. |
dim |
the non-negative integer homological dimension in which the distance is to be computed, default 0. |
p |
a number representing the Wasserstein power parameter, at least 1 and default 2. |
distance |
a string which determines which type of distance calculation to carry out, either "wasserstein" (default) or "fisher". |
sigma |
either NULL (default) or a positive number representing the bandwidth for the Fisher information metric. |
rho |
either NULL (default) or a positive number. If NULL then the exact calculation of the Fisher information metric is returned and otherwise a fast approximation, see details. |
Details
The most common distance calculations between persistence diagrams are the Wasserstein and bottleneck distances, both of which "match" points between their two input diagrams and compute the "loss" of the optimal matching (see https://dl.acm.org/doi/10.1145/3064175 for details). The bottleneck distance is obtained by setting 'distance' to "wasserstein" and 'p' to Inf. Another method for computing distances, the Fisher information metric, converts the two diagrams into distributions defined on the plane and calculates a distance between the resulting two distributions (https://proceedings.neurips.cc/paper/2018/file/959ab9a0695c467e7caf75431a872e5c-Paper.pdf). If the 'distance' parameter is "fisher" then 'sigma' must not be NULL. As noted in the Persistence Fisher paper, a fast approximation exists; it has been implemented from https://github.com/vmorariu/figtree and can be accessed by setting the 'rho' parameter. Smaller values of 'rho' give tighter approximations at the expense of longer runtime, and larger values give looser approximations with shorter runtime.
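To make the "matching" idea concrete, the following base R sketch computes a Wasserstein or bottleneck distance between two very small diagrams by brute force. It is only an illustration and not the package's implementation: the helper names permutations and wasserstein_bruteforce are hypothetical, the ground metric is assumed to be the L-infinity norm, unmatched points are assumed to be sent to their nearest diagonal point, and every permutation is enumerated, so it is only usable for a handful of points.

```r
# All permutations of 1..n, one per row (only practical for small n).
permutations <- function(n) {
  if (n <= 1) return(matrix(seq_len(n), nrow = 1))
  sub <- permutations(n - 1)
  do.call(rbind, lapply(seq_len(n), function(i) {
    rest <- seq_len(n)[-i]
    cbind(i, matrix(rest[as.vector(sub)], nrow = nrow(sub)))
  }))
}

# X, Y: non-empty two-column matrices of (birth, death) points from a single
# homological dimension. ASSUMPTION: L-infinity ground metric.
wasserstein_bruteforce <- function(X, Y, p = 2) {
  nx <- nrow(X); ny <- nrow(Y); n <- nx + ny
  # L-infinity distance from a point to the diagonal is half its persistence
  half_pers <- function(M) (M[, 2] - M[, 1]) / 2
  # rows: points of X, then ny diagonal slots; columns: points of Y, then nx diagonal slots
  cost <- matrix(0, n, n)
  for (i in seq_len(nx)) for (j in seq_len(ny)) {
    cost[i, j] <- max(abs(X[i, ] - Y[j, ]))
  }
  cost[seq_len(nx), ny + seq_len(nx)] <- half_pers(X)                 # X point -> diagonal
  cost[nx + seq_len(ny), seq_len(ny)] <- rep(half_pers(Y), each = ny) # diagonal -> Y point
  # diagonal-to-diagonal matches stay at cost 0
  perms <- permutations(n)
  totals <- apply(perms, 1, function(s) {
    matched <- cost[cbind(seq_len(n), s)]
    if (is.infinite(p)) max(matched) else sum(matched^p)^(1 / p)
  })
  min(totals)  # loss of the optimal matching
}

# toy diagrams: one point in X, two points in Y
X <- rbind(c(0, 2))
Y <- rbind(c(0, 3), c(1, 2))
wasserstein_bruteforce(X, Y, p = 2)    # 2-Wasserstein
wasserstein_bruteforce(X, Y, p = Inf)  # bottleneck
```

In the toy example the optimal matching pairs (0, 2) with (0, 3) at cost 1 and sends (1, 2) to the diagonal at cost 0.5, so the sketch returns sqrt(1.25) for p = 2 and 1 for p = Inf. Values from diagram_distance may differ if the package uses a different ground metric.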
Value
the numeric value of the distance calculation.
Author(s)
Shael Brown - shaelebrown@gmail.com
References
Kerber M, Morozov D and Nigmetov A (2017). "Geometry Helps to Compare Persistence Diagrams." https://dl.acm.org/doi/10.1145/3064175.
Le T, Yamada M (2018). "Persistence Fisher kernel: a Riemannian manifold kernel for persistence diagrams." https://proceedings.neurips.cc/paper/2018/file/959ab9a0695c467e7caf75431a872e5c-Paper.pdf.
Morariu VI, Srinivasan BV, Raykar VC, Duraiswami R and Davis LS (2008). "Automatic online tuning for fast Gaussian summation." Advances in Neural Information Processing Systems (NIPS).
See Also
distance_matrix for distance matrix calculations.
Examples
if (require("TDAstats")) {
  # create two diagrams from random 20-point samples of the circle2d dataset
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100, size = 20), ],
                                     dim = 1, threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100, size = 20), ],
                                     dim = 1, threshold = 2)

  # calculate 2-wasserstein distance between D1 and D2 in dimension 1
  diagram_distance(D1, D2, dim = 1, p = 2, distance = "wasserstein")

  # calculate bottleneck distance between D1 and D2 in dimension 0
  diagram_distance(D1, D2, dim = 0, p = Inf, distance = "wasserstein")

  # Fisher information metric calculation between D1 and D2 for sigma = 1 in dimension 1
  diagram_distance(D1, D2, dim = 1, distance = "fisher", sigma = 1)

  # repeat but with fast approximation
  ## Not run: 
  diagram_distance(D1, D2, dim = 1, distance = "fisher", sigma = 1, rho = 0.001)
  ## End(Not run)
}