plotTopicWord {tosca} | R Documentation |
Plotting Counts of Topics-Words-Combination over Time (Relative to Words)
Description
Creates a plot of the counts or proportions of specified combinations of
topics and words. Keep in mind that the baseline for proportions is the
sum of words, not the sum of topics. See also plotWordpt.
There is an option to plot all curves in one plot or to create one plot for
every curve (see pages). In addition, the plots can be written to a pdf
by setting file.
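The difference between the two baselines can be illustrated with a small made-up count matrix. The following base R sketch does not call tosca; the matrix counts, its topic and word labels, and its values are purely hypothetical, and it assumes for simplicity that the vocabulary consists of only the two words shown.

```r
# Hypothetical counts for one time unit: rows are topics, columns are words.
counts <- matrix(c( 8,  2,
                    2,  8,
                   10, 30),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(topic = c("T1", "T2", "T3"),
                                 word  = c("bush", "tax")))

# Baseline = sum of each word over all topics (the convention described above):
# the share of a word's occurrences that fall into each topic.
rel_to_words <- sweep(counts, 2, colSums(counts), "/")

# Baseline = sum of each topic over all words (the alternative convention):
# the share of a topic's assignments that belong to each word.
rel_to_topics <- sweep(counts, 1, rowSums(counts), "/")

rel_to_words["T1", "bush"]   # 8 / 20 = 0.4
rel_to_topics["T1", "bush"]  # 8 / 10 = 0.8
```

The same cell of the count matrix therefore yields different proportions depending on the baseline, which is why curves from the two conventions are not directly comparable.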
Usage
plotTopicWord(
object,
docs,
ldaresult,
ldaID,
wordlist = lda::top.topic.words(ldaresult$topics, 1),
link = c("and", "or"),
select = 1:nrow(ldaresult$document_sums),
tnames,
wnames,
rel = FALSE,
mark = TRUE,
unit = "month",
curves = c("exact", "smooth", "both"),
smooth = 0.05,
legend = ifelse(pages, "onlyLast:topright", "topright"),
pages = FALSE,
natozero = TRUE,
file,
main,
xlab,
ylab,
ylim,
both.lwd,
both.lty,
col,
...
)
Arguments
object |
textmeta object with tokenized text, such as the result of cleanTexts |
docs |
Object as a result of LDAprep |
ldaresult |
The result of a function call to LDAgen |
ldaID |
Character vector of IDs of the documents in ldaresult |
wordlist |
List of character vectors. Every list element is an 'or'
link; every character string in a vector is linked by the argument
link |
link |
Character: Should the (inner)
character vectors of each list element be linked by an "and" or an "or" (default: "and")? |
select |
List of integer vectors: Which topics (always linked by an "or") should be taken into account for plotting the word counts/proportions (default: all topics as a simple integer vector)? |
tnames |
Character vector of the same length as select: labels for the topics |
wnames |
Character vector of the same length as wordlist: labels for the word groups |
rel |
Logical: Should counts
(FALSE) or proportions (TRUE) be plotted (default: FALSE)? |
mark |
Logical: Should years be marked by
vertical lines (default: TRUE)? |
unit |
Character: To which unit should dates be floored
(default: "month")? |
curves |
Character: Should the "exact" curve, the "smooth" curve, or "both" be plotted (default: "exact")? |
smooth |
Numeric: Smoothing parameter
which is handed over to lowess as f (default: 0.05) |
legend |
Character: Value(s) to specify the legend coordinates (default: "topright", or "onlyLast:topright" if pages = TRUE) |
pages |
Logical: Should all curves be
plotted in a single plot (default: FALSE)? |
natozero |
Logical: Should NAs be coerced
to zeros (default: TRUE)? |
file |
Character: File path if a pdf should be created |
main |
Character: Graphical parameter |
xlab |
Character: Graphical parameter |
ylab |
Character: Graphical parameter |
ylim |
Graphical parameter |
both.lwd |
Graphical parameter for smoothed values
if curves = "both" |
both.lty |
Graphical parameter for smoothed values
if curves = "both" |
col |
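Graphical parameter, could be a vector. If curves = "both", the function plots, for every word group, first the exact and then the smoothed curve; this matters for the order of the colors in col |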
... |
Additional graphical parameters |
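The unit and smooth arguments describe two steps applied to the time axis: dates are floored to a time unit, and the resulting series can be smoothed. A rough base R sketch of these two steps on made-up dates might look as follows. It is only an illustration of the idea, not the package's internal code: the dates are synthetic, flooring is done by hand for months only, stats::lowess is assumed as the smoother, and the span value 0.25 is chosen for this toy series rather than taken from the defaults above.

```r
# Synthetic document dates (hypothetical data): one document every 3 days.
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "3 days")

# Floor every date to the first day of its month (the idea behind unit = "month").
floored <- as.Date(format(dates, "%Y-%m-01"))

# Count documents per floored date; this is the "exact" curve.
tab <- table(floored)
monthly <- data.frame(date  = as.Date(names(tab)),
                      count = as.vector(tab))

# Smooth the monthly series; f is the smoother span of lowess.
# A small f follows the exact curve closely, a large f flattens it.
smoothed <- lowess(as.numeric(monthly$date), monthly$count, f = 0.25)

head(monthly)
head(smoothed$y)
```

Coarser units reduce noise at the cost of temporal resolution, while the smoothing span controls how strongly the remaining month-to-month variation is damped.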
Value
A plot.
Invisible: A data frame with columns date and tnames: wnames, containing
the counts/proportions of the selected combinations of topics and words.
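Because the data frame is returned invisibly, it is not printed at the console and has to be assigned to be inspected. The pattern, together with optional pdf output via a file path, can be sketched in base R as below. The function plot_counts, the data frame dat, and the column name count are hypothetical stand-ins for illustration; they are not part of tosca and do not reproduce its implementation.

```r
# Hypothetical stand-in for a plotting function that returns its data invisibly.
plot_counts <- function(dat, file) {
  if (!missing(file)) {
    pdf(file)            # open a pdf device only when a path is given
    on.exit(dev.off())   # close the device again when the function exits
  }
  plot(dat$date, dat$count, type = "l", xlab = "date", ylab = "count")
  invisible(dat)         # returned, but not auto-printed
}

# Made-up data in the shape "date plus one count column".
dat <- data.frame(date  = as.Date(c("2001-01-01", "2001-02-01", "2001-03-01")),
                  count = c(4, 7, 5))

out_file <- tempfile(fileext = ".pdf")
res <- plot_counts(dat, file = out_file)  # nothing is printed here
head(res)                                 # the plotted values are available in res
```

Assigning the result in the same way when calling plotTopicWord makes the plotted counts or proportions available for further analysis or custom plotting.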
Examples
## Not run:
data(politics)
poliClean <- cleanTexts(politics)
words10 <- makeWordlist(text=poliClean$text)
words10 <- words10$words[words10$wordtable > 10]
poliLDA <- LDAprep(text=poliClean$text, vocab=words10)
LDAresult <- LDAgen(documents=poliLDA, K=10, vocab=words10)
# plot topwords from each topic
plotTopicWord(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA))
plotTopicWord(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA), rel=TRUE)
# plot one word in different topics
plotTopicWord(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA),
  select=c(1,3,8), wordlist=c("bush"))
# Differences between plotTopicWord and plotWordpt
par(mfrow=c(2,2))
plotTopicWord(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA),
  select=c(1,3,8), wordlist=c("bush"), rel=FALSE)
plotWordpt(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA),
  select=c(1,3,8), wordlist=c("bush"), rel=FALSE)
plotTopicWord(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA),
  select=c(1,3,8), wordlist=c("bush"), rel=TRUE)
plotWordpt(object=poliClean, docs=poliLDA, ldaresult=LDAresult, ldaID=names(poliLDA),
  select=c(1,3,8), wordlist=c("bush"), rel=TRUE)
## End(Not run)