h2o.stringdist {h2o} | R Documentation |
Compute element-wise string distances between two H2OFrames
Description
Computes element-wise string distances between two H2OFrames. Both frames must have the same shape (N x M) and contain only string or factor columns. Returns an H2OFrame of shape N x M in which each entry is the distance between the corresponding entries of the two inputs.
Usage
h2o.stringdist(
x,
y,
method = c("lv", "lcs", "qgram", "jaccard", "jw", "soundex"),
compare_empty = TRUE
)
Arguments
x | An H2OFrame
y | A comparison H2OFrame
method | A string identifier indicating which string distance measure to use. Must be one of:
  "lv" - Levenshtein distance
  "lcs" - Longest common substring distance
  "qgram" - q-gram distance
  "jaccard" - Jaccard distance between q-gram profiles
  "jw" - Jaro, or Jaro-Winkler distance
  "soundex" - Distance based on soundex encoding
compare_empty | If set to FALSE, empty strings are handled as NaNs
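To make the element-wise, shape-preserving behaviour concrete, the following is a minimal sketch in base R, with no H2O cluster required. It assumes that the "lv" method corresponds to the plain Levenshtein distance computed by utils::adist; it does not reproduce H2O's implementation. The helper name elementwise_lv and the sample data frames are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not part of h2o): compares two N x M tables of strings
# cell by cell and returns an N x M numeric matrix of Levenshtein distances.
elementwise_lv <- function(x, y) {
  xm <- as.matrix(x)
  ym <- as.matrix(y)
  # Both inputs must have the same shape, as h2o.stringdist also requires.
  stopifnot(identical(dim(xm), dim(ym)))
  # utils::adist() returns a 1 x 1 matrix for two single strings;
  # drop() turns it into a plain number.
  d <- mapply(function(a, b) drop(utils::adist(a, b)),
              xm, ym, USE.NAMES = FALSE)
  matrix(d, nrow = nrow(xm), ncol = ncol(xm))
}

# Illustrative data: two 3 x 2 tables of strings.
x <- data.frame(a = c("Martha", "Dwayne", "Dixon"),
                b = c("kitten", "flaw", "same"),
                stringsAsFactors = FALSE)
y <- data.frame(a = c("Marhta", "Duane", "Dicksonx"),
                b = c("sitting", "lawn", "same"),
                stringsAsFactors = FALSE)

dists <- elementwise_lv(x, y)
print(dists)
```

The result has the same 3 x 2 shape as the inputs, and entry [i, j] depends only on x[i, j] and y[i, j]. The other methods ("lcs", "qgram", "jaccard", "jw", "soundex") follow the same cell-by-cell pattern with a different distance function.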
Examples
## Not run:
h2o.init()
x <- as.h2o(c("Martha", "Dwayne", "Dixon"))
y <- as.character(as.h2o(c("Marhta", "Duane", "Dicksonx")))
h2o.stringdist(x, y, method = "jw")
## End(Not run)
[Package h2o version 3.44.0.3 Index]