diagram_mds {TDApplied} | R Documentation |
Dimension reduction of a group of persistence diagrams via metric multidimensional scaling.
Description
Projects a group of persistence diagrams (or a precomputed distance matrix of diagrams) into a low-dimensional embedding space via metric multidimensional scaling. Such a projection can be used for visualization of data, or a static analysis of the embedding dimensions.
Usage
diagram_mds(
diagrams,
D = NULL,
k = 2,
distance = "wasserstein",
dim = 0,
p = 2,
sigma = NULL,
rho = NULL,
eig = FALSE,
add = FALSE,
x.ret = FALSE,
list. = eig || add || x.ret,
num_workers = parallelly::availableCores(omit = 1)
)
Arguments
diagrams |
a list of n>=2 persistence diagrams which are either the output of a persistent homology calculation like ripsDiag/ |
D |
an optional precomputed distance matrix of persistence diagrams, default NULL. If not NULL, then the 'diagrams' parameter does not need to be supplied. |
k |
the dimension of the space which the data are to be represented in; must be in {1,2,...,n-1}. |
distance |
a string representing the desired distance metric to be used, either 'wasserstein' (default) or 'fisher'. |
dim |
the non-negative integer homological dimension in which the distance is to be computed, default 0. |
p |
a number at least 1 representing the Wasserstein power (Inf for the bottleneck distance), default 2. |
sigma |
a positive number representing the bandwidth for the Fisher information metric, default NULL. |
rho |
an optional positive number representing the heuristic for Fisher information metric approximation, see |
eig |
a boolean indicating whether the eigenvalues should be returned. |
add |
a boolean indicating if an additive constant c* should be computed, and added to the non-diagonal dissimilarities such that the modified dissimilarities are Euclidean. |
x.ret |
a boolean indicating whether the doubly centered symmetric distance matrix should be returned. |
list. |
a boolean indicating if a list should be returned or just the n*k matrix. |
num_workers |
the number of cores used for parallel computation, default is one less than the number of cores on the machine. |
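As a rough illustration of the 'add' argument, the sketch below calls stats::cmdscale directly on a hypothetical 3 x 3 dissimilarity matrix that violates the triangle inequality. The matrix values are made up for illustration and are not computed from persistence diagrams, and the sketch assumes that diagram_mds passes 'add' through to cmdscale unchanged.

```r
# Hypothetical dissimilarities (not from real diagrams): 5 > 1 + 1, so the
# triangle inequality fails and no Euclidean configuration reproduces them.
D <- matrix(c(0, 1, 5,
              1, 0, 1,
              5, 1, 0), nrow = 3, byrow = TRUE)

# Without the additive constant, at least one eigenvalue is negative.
raw <- stats::cmdscale(D, k = 1, eig = TRUE)
has_negative <- min(raw$eig) < 0

# With add = TRUE, a constant c* is added to the off-diagonal entries
# before scaling, so the modified dissimilarities become Euclidean.
adj <- stats::cmdscale(D, k = 1, eig = TRUE, add = TRUE)
adj$ac          # the additive constant that was applied
min(adj$eig)    # non-negative up to floating-point error
```

Setting 'add' to TRUE is mainly useful when negative eigenvalues would otherwise distort the embedding; the price is that the embedded distances correspond to the shifted dissimilarities rather than the original ones.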
Details
Returns the output of cmdscale on the desired distance matrix of a group of persistence diagrams in a particular dimension. If 'distance' is "fisher" then 'sigma' must not be NULL.
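The scaling step itself can be sketched with base R alone. The example below is a minimal illustration, assuming a small hand-written distance matrix (hypothetical values standing in for pairwise diagram distances) in place of one computed from real diagrams; it only shows what cmdscale does with such a matrix.

```r
# Toy symmetric distance matrix standing in for pairwise diagram distances
# (hypothetical values, not computed from real persistence diagrams).
D <- matrix(c(0, 3, 4,
              3, 0, 5,
              4, 5, 0), nrow = 3, byrow = TRUE)

# Metric MDS into k = 2 dimensions; returns a 3 x 2 coordinate matrix
# with one row per object.
emb <- stats::cmdscale(D, k = 2)

# A 3-4-5 triangle is Euclidean in the plane, so the pairwise distances of
# the embedded points should match D up to floating-point error.
max_err <- max(abs(as.matrix(dist(emb)) - D))
```

With real data the distance matrix is rarely exactly Euclidean, so the embedded distances only approximate the originals, and the approximation generally degrades as 'k' decreases.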
Value
the output of cmdscale on the diagram distance matrix. If 'list.' is false (as per default), a matrix with 'k' columns whose rows give the coordinates of the points chosen to represent the dissimilarities.
Otherwise, a list containing the following components.
- points
a matrix with 'k' columns whose rows give the coordinates of the points chosen to represent the dissimilarities.
- eig
the n eigenvalues computed during the scaling process if 'eig' is true.
- x
the doubly centered distance matrix if 'x.ret' is true.
- ac
the additive constant c*, 0 if 'add' = FALSE.
- GOF
a numeric vector of length 2, giving the sum of the 'k' largest eigenvalues divided by the sum of the absolute values of all the eigenvalues (first vector element) or by the sum of the max of each eigenvalue and 0 (second vector element).
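The list form of the return value can be inspected with a direct call to stats::cmdscale. The sketch below again uses a hypothetical toy distance matrix rather than real diagram distances, and recomputes GOF by hand following the definition in the cmdscale documentation, where the numerator is the sum of the first 'k' eigenvalues.

```r
# Hypothetical distance matrix (not computed from real diagrams).
D <- matrix(c(0, 3, 4,
              3, 0, 5,
              4, 5, 0), nrow = 3, byrow = TRUE)

k <- 1
# eig = TRUE switches the return value from a matrix to a list.
res <- stats::cmdscale(D, k = k, eig = TRUE)

lambda <- res$eig                 # all n eigenvalues, in decreasing order
top_k  <- sum(lambda[seq_len(k)]) # numerator shared by both GOF entries

# Goodness of fit recomputed by hand from the eigenvalues.
gof_manual <- c(top_k / sum(abs(lambda)),
                top_k / sum(pmax(lambda, 0)))

gof_matches <- isTRUE(all.equal(res$GOF, gof_manual))
```

Both GOF entries coincide when no eigenvalue is negative; a visible gap between them signals that the dissimilarities are not Euclidean.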
Author(s)
Shael Brown - shaelebrown@gmail.com
References
Cox M and Cox F (2008). "Multidimensional Scaling." doi: 10.1007/978-3-540-33037-0_14.
Examples
if (require("TDAstats")) {
  # create two diagrams, each from 10 points sampled from a circle
  D1 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100, 10), ],
                                     dim = 1, threshold = 2)
  D2 <- TDAstats::calculate_homology(TDAstats::circle2d[sample(1:100, 10), ],
                                     dim = 1, threshold = 2)
  g <- list(D1, D2)

  # calculate their 1D MDS embedding in dimension 0 with the bottleneck distance
  mds <- diagram_mds(diagrams = g, k = 1, dim = 0, p = Inf, num_workers = 2)

  # repeat with a precomputed distance matrix; this gives the same result
  # but is much faster because the distances are not recomputed
  Dmat <- distance_matrix(diagrams = g, dim = 0, p = Inf, num_workers = 2)
  mds <- diagram_mds(D = Dmat, k = 1)
}