lsh_query {textreuse} | R Documentation
Query an LSH cache for matches to a single document
Description
This function retrieves the matches for a single document from an lsh_buckets object created by lsh. See lsh_candidates to retrieve all pairs of matches.
Usage
lsh_query(buckets, id)
Arguments
buckets | An lsh_buckets object created by lsh.
id | The document ID to find matches for.
Value
An lsh_candidates data frame with matches to the document specified.
See Also
lsh, lsh_candidates
Examples
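# Build a corpus with minhash signatures from the sample legal texts
dir <- system.file("extdata/legal", package = "textreuse")
minhash <- minhash_generator(200, seed = 235)
corpus <- TextReuseCorpus(dir = dir,
                          tokenizer = tokenize_ngrams, n = 5,
                          minhash_func = minhash)
# Hash the signatures into LSH buckets, then query for a single document
buckets <- lsh(corpus, bands = 50)
lsh_query(buckets, "ny1850-match")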
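The query result lists candidate matches only, so a natural follow-up is to score them. The sketch below repeats the setup from the example above and then passes the result to lsh_compare with jaccard_similarity, both from textreuse. It assumes the score column of the returned data frame is not yet filled in when lsh_query returns, and the variable names matches and scored are illustrative only.

```r
library(textreuse)

# Same setup as the example above
dir <- system.file("extdata/legal", package = "textreuse")
minhash <- minhash_generator(200, seed = 235)
corpus <- TextReuseCorpus(dir = dir,
                          tokenizer = tokenize_ngrams, n = 5,
                          minhash_func = minhash)
buckets <- lsh(corpus, bands = 50)

# Candidate matches for one document (assumed: scores not yet computed)
matches <- lsh_query(buckets, "ny1850-match")

# Compute a similarity score for each candidate pair
scored <- lsh_compare(matches, corpus, jaccard_similarity)
scored
```

Because LSH only proposes candidates, the computed scores are what distinguish genuine matches from chance bucket collisions. Filtering scored on a threshold of your choosing is one way to keep only the strong matches.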
[Package textreuse version 0.1.5 Index]