pairwise_compare {textreuse} | R Documentation |
Pairwise comparisons among documents in a corpus
Description
Given a TextReuseCorpus containing documents of class TextReuseTextDocument, this function applies a comparison function to every pairing of documents, and returns a matrix with the comparison scores.
Usage
pairwise_compare(corpus, f, ..., directional = FALSE, progress = interactive())
Arguments
corpus |
A TextReuseCorpus. |
f |
The function to apply to each pairing of documents in the corpus. |
... |
Additional arguments passed to f. |
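directional |
Some comparison functions are commutative, so that f(a, b) == f(b, a) (e.g., jaccard_similarity). Other functions are directional, so that f(a, b) may differ from f(b, a) (e.g., ratio_of_matches). If directional is FALSE, each pair of documents is compared only once; if directional is TRUE, both directions are compared. |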
progress |
Display a progress bar while comparing documents. |
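The effect of directional can be illustrated with a small base R sketch. This is not the textreuse implementation: pairwise_sketch, jaccard, share_in, and the toy token sets are hypothetical stand-ins, and the choices to fill only the upper triangle and to skip the diagonal are assumptions made for the illustration.

```r
# Toy stand-in for a pairwise comparison loop (NOT the textreuse source).
pairwise_sketch <- function(docs, f, ..., directional = FALSE) {
  n <- length(docs)
  m <- matrix(NA_real_, n, n, dimnames = list(names(docs), names(docs)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next                  # assumption: no self-comparisons
      if (!directional && i > j) next   # assumption: upper triangle only
      m[i, j] <- f(docs[[i]], docs[[j]], ...)
    }
  }
  m
}

# Hypothetical token sets standing in for documents.
docs <- list(a = c("x", "y", "z"), b = c("y", "z"), c = c("z", "w"))

# A commutative score: Jaccard similarity of two token sets.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# A directional score: share of a's tokens that also occur in b.
share_in <- function(a, b) mean(a %in% b)

sym   <- pairwise_sketch(docs, jaccard)
dir_m <- pairwise_sketch(docs, share_in, directional = TRUE)
```

In this sketch sym["b", "a"] stays NA because the commutative score was already computed as sym["a", "b"], while dir_m holds a separate score for each direction.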
Value
A square matrix with dimensions equal to the length of the corpus, and row and column names set by the names of the documents in the corpus. A value of NA in the matrix indicates that a comparison was not made. For directional comparisons, the score reported is f(row, column).
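A matrix of this shape can be flattened into a table of scored pairs with base R alone. The following is a minimal sketch: the matrix m, its document names, and its scores are made up for illustration and do not come from the package's sample corpus.

```r
# Hypothetical score matrix shaped like the return value described above
# (made-up values; NA marks pairs that were not compared).
m <- matrix(c(NA, 0.20, 0.05,
              NA, NA,   0.60,
              NA, NA,   NA),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Row/column positions of every cell that holds a score.
idx <- which(!is.na(m), arr.ind = TRUE)

# One row per compared pair; score is f(row, column).
pairs <- data.frame(
  row    = rownames(m)[idx[, "row"]],
  column = colnames(m)[idx[, "col"]],
  score  = m[idx],
  stringsAsFactors = FALSE
)

# Highest-scoring pairs first.
pairs <- pairs[order(pairs$score, decreasing = TRUE), ]
```

Filtering on !is.na(m) rather than on a score threshold keeps genuine zero scores and drops only the pairs that were never compared.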
See Also
See these document comparison functions: jaccard_similarity, ratio_of_matches.
Examples
library(textreuse)
# Sample documents shipped with the package
dir <- system.file("extdata/legal", package = "textreuse")
corpus <- TextReuseCorpus(dir = dir)
names(corpus) <- filenames(names(corpus))
# A non-directional comparison
pairwise_compare(corpus, jaccard_similarity)
# A directional comparison
pairwise_compare(corpus, ratio_of_matches, directional = TRUE)