Lookup {comparator} | R Documentation |
Lookup String Comparator
Description
Compares a pair of strings x and y by retrieving their distance/similarity
score from a provided lookup table.
Usage
Lookup(
  lookup_table,
  values_colnames,
  score_colname,
  default_match = 0,
  default_nonmatch = NA_real_,
  symmetric = TRUE,
  ignore_case = FALSE
)
Arguments
lookup_table |
data frame containing distances/similarities for pairs of values |
values_colnames |
character vector containing the column names
corresponding to pairs of values (e.g. strings) in lookup_table |
score_colname |
name of the column that contains distances/similarities
in lookup_table |
default_match |
distance/similarity to use if the pair of values
match exactly and do not appear in lookup_table |
default_nonmatch |
distance/similarity to use if the pair of values are
not an exact match and do not appear in lookup_table |
symmetric |
whether the underlying distance/similarity scores are
symmetric. If TRUE |
ignore_case |
a logical. If TRUE, case is ignored when comparing the strings. |
Details
The lookup table should contain three columns: two holding the values x
and y (named by values_colnames) and one holding the distance/similarity
(named by score_colname). If a pair of values x and y is not in the lookup
table, a default distance/similarity is returned depending on whether
x = y (default_match) or x ≠ y (default_nonmatch).
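The fallback rule described above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's implementation: lookup_score is a hypothetical helper name, and the handling of symmetric (also checking the reversed pair) is an assumption, since the argument's description is incomplete here.

```r
# Illustrative base-R sketch of the lookup rule; NOT the package's code.
# `lookup_score` is a hypothetical helper introduced for this example.
lookup_score <- function(x, y, lookup_table, values_colnames, score_colname,
                         default_match = 0, default_nonmatch = NA_real_,
                         symmetric = TRUE) {
  first  <- as.character(lookup_table[[values_colnames[1]]])
  second <- as.character(lookup_table[[values_colnames[2]]])
  hit <- first == x & second == y
  # Assumption: with symmetric = TRUE the reversed pair (y, x) also counts.
  if (symmetric) hit <- hit | (first == y & second == x)
  if (any(hit)) {
    # Pair found: return its score (first matching row).
    return(lookup_table[[score_colname]][which(hit)[1]])
  }
  # Pair not in the table: fall back to the defaults.
  if (identical(x, y)) default_match else default_nonmatch
}

lookup_table <- data.frame(x = c("Melbourne", "Melbourne", "Sydney"),
                           y = c("Sydney", "Brisbane", "Brisbane"),
                           dist = c(713.4, 1374.8, 732.5))

lookup_score("Melbourne", "Sydney", lookup_table, c("x", "y"), "dist")  # 713.4 (listed pair)
lookup_score("Perth", "Perth", lookup_table, c("x", "y"), "dist")       # 0 (default_match)
lookup_score("Melbourne", "Perth", lookup_table, c("x", "y"), "dist")   # NA (default_nonmatch)
```

The sketch handles a single pair of strings at a time to keep the rule easy to read.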
Value
A Lookup instance is returned, which is an S4 class inheriting from
StringComparator.
Examples
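## Measure the distance between cities
library(comparator)

# Each row gives the distance for one pair of cities
lookup_table <- data.frame(x = c("Melbourne", "Melbourne", "Sydney"),
                           y = c("Sydney", "Brisbane", "Brisbane"),
                           dist = c(713.4, 1374.8, 732.5))
comparator <- Lookup(lookup_table, c("x", "y"), "dist")

# This pair appears in the table as ("Melbourne", "Sydney")
comparator("Sydney", "Melbourne")
# This pair is not in the table and is not an exact match,
# so default_nonmatch applies
comparator("Melbourne", "Perth")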