Lookup {comparator}    R Documentation

Lookup String Comparator

Description

Compares a pair of strings x and y by retrieving their distance/similarity score from a provided lookup table.

Usage

Lookup(
  lookup_table,
  values_colnames,
  score_colname,
  default_match = 0,
  default_nonmatch = NA_real_,
  symmetric = TRUE,
  ignore_case = FALSE
)

Arguments

lookup_table

data frame containing distances/similarities for pairs of values

values_colnames

character vector giving the names of the two columns in lookup_table that hold the paired values (e.g. strings)

score_colname

name of column that contains distances/similarities in lookup_table

default_match

distance/similarity to use if the pair of values match exactly and do not appear in lookup_table. Defaults to 0.0.

default_nonmatch

distance/similarity to use if the pair of values is not an exact match and does not appear in lookup_table. Defaults to NA.

symmetric

a logical indicating whether the underlying distance/similarity scores are symmetric. If TRUE, lookup_table need only contain one ordering of each pair, i.e. an entry for the value pair (y, x) is not required if an entry for (x, y) is already present.

ignore_case

a logical. If TRUE, case is ignored when comparing the strings.
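
As a rough illustration of the symmetric and ignore_case arguments, the sketch below builds two comparators from the same city-distance table used in the Examples section. It assumes the comparator package is installed; the object names directed and caseless are purely illustrative, and the comments restate the behaviour described above rather than verified output.

```r
library(comparator)

lookup_table <- data.frame(x = c("Melbourne", "Melbourne", "Sydney"),
                           y = c("Sydney", "Brisbane", "Brisbane"),
                           dist = c(713.4, 1374.8, 732.5))

# symmetric = FALSE: only the orderings listed in the table are looked up
directed <- Lookup(lookup_table, values_colnames = c("x", "y"),
                   score_colname = "dist", symmetric = FALSE)
directed("Melbourne", "Sydney")  # listed in this order in the table
directed("Sydney", "Melbourne")  # not listed in this order, so default_nonmatch applies

# ignore_case = TRUE: differences in case are ignored when comparing strings
caseless <- Lookup(lookup_table, values_colnames = c("x", "y"),
                   score_colname = "dist", ignore_case = TRUE)
caseless("melbourne", "SYDNEY")
```

Setting symmetric = FALSE is the natural choice when the score depends on direction, in which case both orderings must be listed explicitly.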

Details

The lookup table should contain three columns: two holding the values x and y (named by values_colnames) and one holding the distance/similarity (named by score_colname). If a pair of values x and y is not in the lookup table, a default distance/similarity is returned, depending on whether x = y (default_match) or x != y (default_nonmatch).
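
The lookup rule described above can be sketched in base R for a single pair of values. This is only an illustration of the rule, not the package's implementation: the helper lookup_score and the table name cities are hypothetical, and the sketch ignores vectorisation and ignore_case.

```r
# Hypothetical helper illustrating the lookup rule; not part of the comparator package
lookup_score <- function(x, y, lookup_table, values_colnames, score_colname,
                         default_match = 0, default_nonmatch = NA_real_,
                         symmetric = TRUE) {
  a <- lookup_table[[values_colnames[1]]]
  b <- lookup_table[[values_colnames[2]]]
  hit <- which(a == x & b == y)
  if (length(hit) == 0 && symmetric) {
    # For symmetric scores, an entry for (y, x) also answers a query for (x, y)
    hit <- which(a == y & b == x)
  }
  if (length(hit) > 0) {
    return(lookup_table[[score_colname]][hit[1]])
  }
  # Pair not in the table: fall back on the defaults
  if (x == y) default_match else default_nonmatch
}

cities <- data.frame(x = c("Melbourne", "Melbourne", "Sydney"),
                     y = c("Sydney", "Brisbane", "Brisbane"),
                     dist = c(713.4, 1374.8, 732.5))

lookup_score("Sydney", "Melbourne", cities, c("x", "y"), "dist")  # 713.4, via the (Melbourne, Sydney) entry
lookup_score("Perth", "Perth", cities, c("x", "y"), "dist")       # 0, the default_match
lookup_score("Melbourne", "Perth", cities, c("x", "y"), "dist")   # NA, the default_nonmatch
```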

Value

A Lookup instance is returned, which is an S4 class inheriting from StringComparator.

Examples

## Measure the distance between cities
lookup_table <- data.frame(x = c("Melbourne", "Melbourne", "Sydney"), 
                           y = c("Sydney", "Brisbane", "Brisbane"), 
                           dist = c(713.4, 1374.8, 732.5))

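comparator <- Lookup(lookup_table, c("x", "y"), "dist")
# (Sydney, Melbourne) is found via the (Melbourne, Sydney) entry, since symmetric = TRUE
comparator("Sydney", "Melbourne")
# Perth is not in the table and the values differ, so default_nonmatch (NA) is returned
comparator("Melbourne", "Perth")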


[Package comparator version 0.1.2 Index]