plotWordSub {tosca} | R Documentation |
Plotting Counts/Proportion of Words/Docs in LDA-generated Topic-Subcorpora over Time
Description
Creates a plot of the counts/proportions of words/docs in subcorpora that are
generated from an ldaresult. An article is allocated to a topic, and thus to
that topic's subcorpus, if enough of its words are allocated to the
corresponding topic (see limit and alloc). The subcorpora are additionally
reduced by filterWord and a search argument. The plot shows the counts of the
subcorpora or, if rel = TRUE, the proportion of each subcorpus relative to its
corresponding whole corpus.
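The allocation rule described above can be sketched in base R. This is only an illustration on made-up data, not the package's implementation: the matrix below is a hypothetical stand-in for ldaresult$document_sums (rows are topics, columns are documents), and the sketch assumes that a document joins every topic for which it reaches the threshold.

```r
# Hypothetical toy data standing in for ldaresult$document_sums:
# rows are topics, columns are documents, entries are word allocations.
document_sums <- matrix(
  c(12,  3,  0,
     1, 15, 11,
     0,  2, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("T", 1:3), c("docA", "docB", "docC"))
)

limit <- 10  # minimum number of word allocations (same value as the default)

# TRUE where a document has at least `limit` words allocated to a topic
belongs <- document_sums >= limit

# Collect the document IDs of each topic's subcorpus
docs_per_topic <- lapply(
  seq_len(nrow(belongs)),
  function(k) colnames(belongs)[belongs[k, ]]
)
names(docs_per_topic) <- rownames(belongs)

print(docs_per_topic)
```

In this toy case docC reaches the threshold for two topics, so it appears in both subcorpora.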
Usage
plotWordSub(
  object,
  ldaresult,
  ldaID,
  limit = 10,
  alloc = c("multi", "unique", "best"),
  select = 1:nrow(ldaresult$document_sums),
  tnames,
  search,
  ignore.case = TRUE,
  type = c("docs", "words"),
  rel = TRUE,
  mark = TRUE,
  unit = "month",
  curves = c("exact", "smooth", "both"),
  smooth = 0.05,
  main,
  xlab,
  ylab,
  ylim,
  both.lwd,
  both.lty,
  col,
  legend = "topright",
  natozero = TRUE,
  file,
  ...
)
Arguments
object |
|
ldaresult |
The result of a function call |
ldaID |
Character vector of IDs of the documents in
|
limit |
Integer/numeric: How often words must be
allocated to a topic for the article to count as belonging
to this topic - if |
alloc |
Character: Should every article
be allocated to multiple topics ( |
select |
Integer vector: Which topics of
|
tnames |
Character vector of same length as |
search |
See |
ignore.case |
See |
type |
Character: Should counts/proportion of documents, where every
|
rel |
Logical. Should counts ( |
mark |
Logical: Should years be marked by
vertical lines (default: TRUE) |
unit |
Character: To which unit should dates be floored
(default: "month") |
curves |
Character: Should |
smooth |
Numeric: Smoothing parameter
which is handed over to |
main |
Character: Graphical parameter |
xlab |
Character: Graphical parameter |
ylab |
Character: Graphical parameter |
ylim |
Graphical parameter (default if |
both.lwd |
Graphical parameter for smoothed values
if |
both.lty |
Graphical parameter for smoothed values
if |
col |
Graphical parameter, could be a vector. If |
legend |
Character: Value(s) to specify the legend coordinates (default: "topright"). If "none" no legend is plotted. |
natozero |
Logical: Should NAs be coerced
to zeros (default: TRUE) |
file |
Character: File path if a pdf should be created |
... |
Additional graphical parameters |
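To make the interplay of unit and rel more concrete, the following base R sketch floors dates to months and divides subcorpus counts by whole-corpus counts. The dates and the in_sub indicator are invented for illustration, and the flooring via format() is a base R stand-in; how the function itself floors dates is not shown in this page.

```r
# Hypothetical publication dates of all documents in the whole corpus
dates_all <- as.Date(c("2020-01-03", "2020-01-20", "2020-02-11",
                       "2020-02-15", "2020-02-28", "2020-03-07"))

# Hypothetical indicator: which documents ended up in one topic subcorpus
in_sub <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)

# Floor every date to the first day of its month (base R stand-in for unit = "month")
month_all <- format(dates_all, "%Y-%m-01")

# Document counts per month for the whole corpus and for the subcorpus
whole <- table(month_all)
sub <- table(factor(month_all[in_sub], levels = names(whole)))

# Proportion of the subcorpus relative to the whole corpus (the idea behind rel = TRUE)
prop <- as.numeric(sub) / as.numeric(whole)
names(prop) <- names(whole)

print(prop)
```

Using the whole corpus's months as factor levels keeps months without any subcorpus document in the result, with a count of zero.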
Value
A plot.
Invisible: A data frame with columns date and tnames containing the
counts/proportions of the selected topics.
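Because the data frame is returned invisibly, it is only available if the call is assigned to a variable. The sketch below shows that pattern with a hypothetical stand-in function, toy_plot, and invented values; it does not call plotWordSub itself, and the topic column names T1 and T2 are assumptions.

```r
# Hypothetical stand-in for a plotting function that returns its data invisibly
toy_plot <- function() {
  out <- data.frame(
    date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01")),
    T1 = c(0.50, 0.67, 0.00),
    T2 = c(0.25, 0.10, 0.40)
  )
  grDevices::pdf(NULL)  # null device, so the sketch writes no file
  plot(out$date, out$T1, type = "l", ylim = c(0, 1),
       xlab = "date", ylab = "proportion")
  lines(out$date, out$T2, lty = 2)
  grDevices::dev.off()
  invisible(out)        # returned, but not printed at the console
}

# Assign the result to keep the plotted values for further analysis
res <- toy_plot()

# Example follow-up: the date at which topic T1 peaks
peak <- res$date[which.max(res$T1)]
print(peak)
```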
Examples
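## Not run: 
# Load the example corpus shipped with the package and clean its texts
data(politics)
poliClean <- cleanTexts(politics)
# Restrict the corpus to texts matching the search terms
poliPraesidents <- filterWord(object=poliClean, search=c("bush", "obama"))
# Build the vocabulary from words that occur more than 10 times
words10 <- makeWordlist(text=poliPraesidents$text)
words10 <- words10$words[words10$wordtable > 10]
# Prepare the documents and fit an LDA with 5 topics
poliLDA <- LDAprep(text=poliPraesidents$text, vocab=words10)
LDAresult <- LDAgen(documents=poliLDA, K=5, vocab=words10)
# Plot the topic subcorpora over time, filtered by the search term "obama"
plotWordSub(object=poliClean, ldaresult=LDAresult, ldaID=names(poliLDA), search="obama")
## End(Not run)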