word_proximity {qdap} | R Documentation |
Proximity Matrix Between Words
Description
word_proximity - Generate proximity measures to ascertain a mean
distance measure between word uses.
Usage
word_proximity(
text.var,
terms,
grouping.var = NULL,
parallel = TRUE,
cores = parallel::detectCores()/2
)
## S3 method for class 'word_proximity'
weight(x, type = "scale", ...)
Arguments
text.var |
The text variable. |
terms |
A vector of quoted terms. |
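grouping.var |
The grouping variables. Default is NULL (no grouping). |
parallel |
logical. If TRUE, attempts to run the function on multiple cores. |
cores |
The number of cores to use if parallel = TRUE. Defaults to half the cores reported by parallel::detectCores(). |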
x |
An object to be weighted. |
type |
A weighting type, such as "scale" (the default) or "rev_scale_log". |
... |
ignored. |
Details
Note that row names are the first word and column names are the
second comparison word. The values for Word A compared to Word B will not
be the same as Word B compared to Word A. This is because, unlike a true
distance measure, word_proximity's matrix is asymmetrical.
word_proximity computes the distance by taking each sentence position
for Word A and comparing it to the nearest sentence location for Word B.
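The asymmetry described above can be illustrated with a minimal base R sketch. The helper nearest_sentence_distance is hypothetical and not part of qdap. It assumes the measure is the mean, over every sentence containing Word A, of the distance to the nearest sentence containing Word B. The actual qdap implementation may differ in its details, for example in how absent terms are handled.

```r
# Hypothetical helper (not a qdap function): mean distance from each
# sentence containing word A to the nearest sentence containing word B.
nearest_sentence_distance <- function(pos_a, pos_b) {
  # No occurrences of either word: no distance can be computed
  if (length(pos_a) == 0 || length(pos_b) == 0) return(NA_real_)
  # For every sentence index holding A, find the closest index holding B
  nearest <- vapply(pos_a, function(a) min(abs(a - pos_b)), numeric(1))
  mean(nearest)
}

pos_a <- c(1, 10)  # sentence indices containing word A (made-up data)
pos_b <- c(2, 3)   # sentence indices containing word B (made-up data)

a_to_b <- nearest_sentence_distance(pos_a, pos_b)  # (1 + 7) / 2 = 4
b_to_a <- nearest_sentence_distance(pos_b, pos_a)  # (1 + 2) / 2 = 1.5
```

Swapping the arguments changes the result because each direction starts from a different set of sentence positions, which is why the row word and column word are not interchangeable.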
Value
Returns a list of matrices of proximity measures in the unit of average sentences between words (defaults to scaled).
Note
The terms argument is character sensitive. Spacing is an important way to grab specific words and requires careful thought. Using "read" will find the words "bread", "read", "reading", and "ready". If you want to search for just the word "read" you'd supply a vector of c(" read ", " reads", " reading", " reader").
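The effect of spacing can be sketched with base R's grepl and fixed (literal) matching. This only illustrates the substring behaviour the note describes. It is an assumption that qdap matches terms in a comparable literal way, and the padding step below is part of the illustration, not a qdap call.

```r
words <- c("bread", "read", "reading", "ready")

# Pad each word with spaces, as it would appear between other words in text
padded <- paste0(" ", words, " ")

# A bare term matches anywhere inside a word
hits_bare <- grepl("read", padded, fixed = TRUE)

# A space-padded term only matches the stand-alone word
hits_spaced <- grepl(" read ", padded, fixed = TRUE)

words[hits_bare]    # all four words
words[hits_spaced]  # only "read"
```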
Examples
## Not run:
library(qdap)

## Build a lower-cased term list from the 2012 debates, dropping stopwords
wrds <- word_list(pres_debates2012$dialogue,
    stopwords = c("it's", "that's", Top200Words))
wrds2 <- tolower(sort(wrds$rfswl[[1]][, 1]))

## Proximity for all dialogue, plotted raw and with two weightings
(x <- with(pres_debates2012, word_proximity(dialogue, wrds2)))
plot(x)
plot(weight(x))
plot(weight(x, "rev_scale_log"))

## Same terms, grouped by person
(x2 <- with(pres_debates2012, word_proximity(dialogue, wrds2, person)))

## The spaces around `terms` are important
(x3 <- with(DATA, word_proximity(state, spaste(qcv(the, i)))))
(x4 <- with(DATA, word_proximity(state, qcv(the, i))))
## End(Not run)