pairwise_delta {widyr} | R Documentation |
Delta measure of pairs of documents
Description
Compute the delta distance between all pairs of documents in a tidy table, using one of two variants: Burrows' Delta (the default) or Argamon's Linear Delta.
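As a rough illustration of what a delta distance measures, the sketch below computes the textbook form of Burrows' Delta in base R: each feature is standardized to z-scores across documents, and the distance between two documents is the mean absolute difference of their z-scores. The toy matrix `counts` and its document and word names are hypothetical, and raw counts are used for brevity where the classic definition uses relative frequencies. This is a sketch under those assumptions, not necessarily how `pairwise_delta()` is implemented internally.

```r
# Toy document-feature matrix: rows are documents, columns are word counts.
# (Hypothetical data, for illustration only.)
counts <- matrix(
  c(10, 4, 1,
     8, 6, 2,
     2, 9, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc_a", "doc_b", "doc_c"), c("the", "of", "and"))
)

# Standardize each feature (column) to z-scores across documents.
z <- scale(counts)

# Burrows' Delta: Manhattan distance between z-score vectors,
# divided by the number of features (i.e. the mean absolute difference).
burrows <- as.matrix(dist(z, method = "manhattan")) / ncol(z)

round(burrows, 3)
```

The result is a symmetric document-by-document matrix with zeros on the diagonal; smaller values indicate documents whose feature profiles are more alike.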
Usage
pairwise_delta(tbl, item, feature, value, method = "burrows", ...)
pairwise_delta_(tbl, item, feature, value, method = "burrows", ...)
Arguments
tbl | Table
item | Item to compare; will end up in
feature | Column describing the feature that links one item to others
value | Value
method | Distance measure to be used; see
... | Extra arguments passed on to
See Also
Examples
library(janeaustenr)
library(dplyr)
library(tidytext)
# closest documents in terms of 1000 most frequent words
closest <- austen_books() %>%
  unnest_tokens(word, text) %>%
  count(book, word) %>%
  top_n(1000, n) %>%
  pairwise_delta(book, word, n, method = "burrows") %>%
  arrange(delta)
closest
closest %>%
  filter(item1 == "Pride & Prejudice")
# to remove duplicates, use upper = FALSE
closest <- austen_books() %>%
  unnest_tokens(word, text) %>%
  count(book, word) %>%
  top_n(1000, n) %>%
  pairwise_delta(book, word, n, method = "burrows", upper = FALSE) %>%
  arrange(delta)
# Can also use Argamon's Linear Delta
closest <- austen_books() %>%
  unnest_tokens(word, text) %>%
  count(book, word) %>%
  top_n(1000, n) %>%
  pairwise_delta(book, word, n, method = "argamon", upper = FALSE) %>%
  arrange(delta)
[Package widyr version 0.1.5 Index]