
match_two_sig_sets {mSigTools}

R Documentation

Find an optimal matching between two sets of signatures subject to a maximum distance.

Description

Find an optimal matching between two sets of signatures subject to a maximum distance.

Usage

match_two_sig_sets(
  x1,
  x2,
  method = "cosine",
  convert.sim.to.dist = function(x) {
    return(1 - x)
  },
  cutoff = 0.9
)

Arguments

`x1`	A numerical-matrix-like object with columns as signatures.
`x2`	A numerical-matrix-like object with columns as signatures. Needs to have the same number of rows as `x1`.
`method`	As for the `distance` function in package `philentropy`.
`convert.sim.to.dist`	If `method` specifies a similarity rather than a distance, use this function to convert the similarity to a distance.
`cutoff`	A maximum distance or minimum similarity over which to pair signatures between `x1` and `x2`.
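
As an illustration of the `convert.sim.to.dist` argument, the sketch below defines an alternative converter for cosine similarity. The name `angular_dist` is hypothetical and not part of mSigTools; it only shows the shape such a function could take (one numeric input, one numeric output of the same size). Note that `cutoff` would have to be chosen on the scale of whatever distance the converter returns.

```r
# Hypothetical converter (not part of mSigTools): map a cosine similarity
# in [-1, 1] to a normalized angular distance in [0, 1].
angular_dist <- function(sim) {
  # Clamp first so tiny floating-point overshoots do not make acos() return NaN.
  sim <- pmin(pmax(sim, -1), 1)
  acos(sim) / pi
}

# Default converter shown in Usage, for comparison.
default_dist <- function(x) {
  return(1 - x)
}

sims <- c(1, 0.5, 0)
default_dist(sims)
angular_dist(sims)

# Assumed usage, left commented out because it needs mSigTools installed:
# match_two_sig_sets(x1, x2, convert.sim.to.dist = angular_dist, cutoff = 0.5)
```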

Details

Match signatures between x1 and x2 using the function solve_LSAP (package clue), which implements the "Hungarian" (a.k.a. "Kuhn–Munkres") algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm) to minimize the total cost associated with the links between nodes. This function generates a distance matrix between the two sets of signatures using method and, if necessary, convert.sim.to.dist. It then sets distances > cutoff to very large values and applies solve_LSAP to the resulting matrix to compute a matching between x1 and x2 that minimizes the sum of the distances.
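
The steps described above can be sketched roughly as follows. This is not the mSigTools source; it is a minimal illustration that assumes the clue package is installed, computes cosine similarity in base R instead of calling philentropy, and uses an arbitrary penalty value (`big`) in place of the unspecified "very large values". The helper `cosine_sim` and the final filtering step are assumptions made for the example.

```r
# Hypothetical helper (not part of mSigTools): cosine similarity between
# every column of a and every column of b, using base R only.
cosine_sim <- function(a, b) {
  num <- crossprod(a, b)  # ncol(a) x ncol(b) matrix of dot products
  den <- outer(sqrt(colSums(a^2)), sqrt(colSums(b^2)))
  num / den
}

ex.sigs <- matrix(c(0.2, 0.8, 0.3, 0.7, 0.6, 0.4), nrow = 2)
colnames(ex.sigs) <- c("ex1", "ex2", "ex3")
ref.sigs <- matrix(c(0.21, 0.79, 0.19, 0.81), nrow = 2)
colnames(ref.sigs) <- c("ref1", "ref2")
cutoff <- 0.9

# Step 1: similarity -> distance, as the default convert.sim.to.dist does.
dist.mat <- 1 - cosine_sim(ex.sigs, ref.sigs)

# Step 2: replace distances above the cutoff with a large penalty.
big <- 1e9  # assumed value; the documentation only says "very large"
mod.mat <- dist.mat
mod.mat[mod.mat > cutoff] <- big

# Step 3: solve_LSAP needs at least as many columns as rows,
# so put the smaller signature set on the rows.
cost <- if (nrow(mod.mat) > ncol(mod.mat)) t(mod.mat) else mod.mat
col.idx <- as.integer(clue::solve_LSAP(cost))  # minimizes total cost by default

# Collect the matched pairs and drop any that only matched via the penalty.
pairs <- data.frame(
  row.sig = rownames(cost),
  col.sig = colnames(cost)[col.idx],
  dist    = cost[cbind(seq_len(nrow(cost)), col.idx)]
)
pairs <- pairs[pairs$dist <= cutoff, ]
pairs
```

Because the assignment is one-to-one, the larger set always has unmatched signatures; here one of the three `ex` signatures is left without a partner.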

Value

A list with the elements

table Table of extracted signatures that matched a reference signature. Each row contains the extracted signature name, the reference signature name, and the distance of the match.
orig.matrix The matrix of numeric distances between x1 and x2.
modified.matrix The matrix orig.matrix with distances > cutoff changed to very large values.

Examples

ex.sigs <- matrix(c(0.2, 0.8, 0.3, 0.7, 0.6, 0.4), nrow = 2)
colnames(ex.sigs) <- c("ex1", "ex2", "ex3")
ref.sigs <- matrix(c(0.21, 0.79, 0.19, 0.81), nrow = 2)
colnames(ref.sigs) <- c("ref1", "ref2")
match_two_sig_sets(ex.sigs, ref.sigs, cutoff = .9)

[Package mSigTools version 1.0.7 Index]