plotFreq {tosca} | R Documentation |
Plotting Counts of specified Wordgroups over Time (relative to Corpus)
Description
Creates a plot of the counts/proportion of given wordgroups (wordlist) in the
subcorpus. The counts/proportion can be calculated on document or word
level, with an 'and' or 'or' link, and can additionally be normalised by
a subcorpus, which can be specified by id.
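The 'and'/'or' linking on document level can be illustrated with a minimal base R sketch. It is not the implementation used by plotFreq: the toy corpus and the helpers has_word and match_group are hypothetical, and matching via grepl is assumed only because ignore.case is documented as a grepl option.

```r
# Toy tokenized corpus: a named list of character vectors (hypothetical data).
texts <- list(
  a = c("obama", "met", "bush"),
  b = c("obama", "spoke"),
  c = c("bush", "left"),
  d = c("nothing", "relevant")
)

# Hypothetical helper: TRUE if any token of the document matches the word.
has_word <- function(tokens, word, ignore.case = FALSE) {
  any(grepl(word, tokens, ignore.case = ignore.case))
}

# Hypothetical helper: does a document match a word group under the given link?
match_group <- function(tokens, group, link = c("and", "or"),
                        ignore.case = FALSE) {
  link <- match.arg(link)
  hits <- vapply(group, function(w) has_word(tokens, w, ignore.case),
                 logical(1))
  if (link == "and") all(hits) else any(hits)
}

group <- c("obama", "bush")

# Documents containing both words ("and") or at least one of them ("or").
docs_and <- vapply(texts, match_group, logical(1), group = group, link = "and")
docs_or  <- vapply(texts, match_group, logical(1), group = group, link = "or")

# A plain character vector becomes one single-word group per element.
single_groups <- as.list(group)
```

In this sketch only document a matches under "and", while a, b and c match under "or". Because as.list splits a plain character vector into single-word groups, there is nothing left inside a group for link to combine.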
Usage
plotFreq(
object,
id = names(object$text),
type = c("docs", "words"),
wordlist,
link = c("and", "or"),
wnames,
ignore.case = FALSE,
rel = FALSE,
mark = TRUE,
unit = "month",
curves = c("exact", "smooth", "both"),
smooth = 0.05,
both.lwd,
both.lty,
main,
xlab,
ylab,
ylim,
col,
legend = "topright",
natozero = TRUE,
file,
...
)
Arguments
object |
textmeta object with strictly tokenized
text component (character vectors) - like a result of
cleanTexts
|
id |
character vector (default: names(object$text) ) of IDs which
specify the subcorpus
|
type |
character (default: "docs" ) should counts/proportion
of documents ("docs" ) or of words ("words" ) be plotted
|
wordlist |
list of character vectors. Every list element is an 'or'
link, every character string in a vector is linked by the argument
link . If wordlist is only a character vector it will be
coerced to a list of the same length as the vector (see as.list ),
so that the argument link has no effect. Each character vector
as a list element represents one curve in the resulting plot
|
link |
character (default: "and" ) should the character strings
within each (inner) character vector of the list be linked by an "and"
or an "or"
|
wnames |
character vector of same length as wordlist
- labels for every group of 'and' linked words
|
ignore.case |
logical (default: FALSE ) option
from grepl .
|
rel |
logical (default: FALSE ) should counts
(FALSE ) or proportion (TRUE ) be plotted
|
mark |
logical (default: TRUE ) should years be marked by
vertical lines
|
unit |
character (default: "month" ) to which unit should
dates be floored. Other possible units are "bimonth" , "quarter" , "season" ,
"halfyear" , "year" , for more units see round_date
|
curves |
character (default: "exact" ) should the "exact" curve,
the "smooth" curve or "both" be plotted
|
smooth |
numeric (default: 0.05 ) smoothing parameter
which is handed over to lowess as f
|
both.lwd |
graphical parameter for smoothed values
if curves = "both"
|
both.lty |
graphical parameter for smoothed values
if curves = "both"
|
main |
character graphical parameter
|
xlab |
character graphical parameter
|
ylab |
character graphical parameter
|
ylim |
(default if rel = TRUE : c(0, 1) ) graphical parameter
|
col |
graphical parameter, could be a vector. If curves = "both"
the function plots, for every wordgroup, first the exact and then the
smoothed curve, so the colours in col must be given in that order.
|
legend |
character (default: "topright") value(s) to specify the
legend coordinates. If "none" no legend is plotted.
|
natozero |
logical (default: TRUE ) should NAs be coerced
to zeros. Only has effect if rel = TRUE .
|
file |
character file path if a pdf should be created
|
... |
additional graphical parameters
|
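The smooth argument is documented as being handed over to lowess as f. The following base R sketch shows what such a call looks like on simulated monthly counts. The data, the colours and the line width are illustrative assumptions, not defaults of plotFreq.

```r
# Simulated monthly counts (hypothetical data, 10 years).
set.seed(1)
x <- seq_len(120)
y <- rpois(120, lambda = 10)

# f is the smoother span: the proportion of points used for each local fit.
# f = 0.05 matches the documented default of the smooth argument.
sm <- lowess(x, y, f = 0.05)

# Exact curve first, smoothed curve on top, similar in spirit to
# curves = "both".
plot(x, y, type = "l", col = "grey", xlab = "month", ylab = "count")
lines(sm$x, sm$y, col = "red", lwd = 2)
```

Larger values of f use more neighbouring points per local fit and therefore give a smoother curve.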
Value
A plot.
Invisible: A dataframe with columns date and wnames (and additionally
columns wnames_rel for rel = TRUE ) with the counts (and proportion) of the
given wordgroups.
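The shape of such a return value can be sketched in base R. This is an illustration under stated assumptions, not the code of plotFreq: dates are floored to months with format instead of round_date, the column names obama and obama_rel are assumed examples of the wnames and wnames_rel pattern, and empty periods are taken to be where missing values come from when rel = TRUE.

```r
# Hypothetical document dates and a flag for documents matching the wordgroup.
dates <- as.Date(c("2010-01-05", "2010-01-20", "2010-02-11", "2010-04-02"))
hit   <- c(TRUE, FALSE, TRUE, FALSE)

# Floor each date to the first day of its month (base R stand-in).
floored <- as.Date(format(dates, "%Y-%m-01"))

# Complete monthly grid, so that months without any document appear as well.
grid <- seq(min(floored), max(floored), by = "month")
lev  <- as.character(grid)

# Documents per month in the subcorpus, and matching documents per month.
total <- as.vector(table(factor(as.character(floored), levels = lev)))
count <- as.vector(table(factor(as.character(floored[hit]), levels = lev)))

# Proportion per month; an empty month yields 0/0 = NaN.
prop <- count / total

# Analogous to natozero = TRUE: replace missing proportions by zero.
prop[is.na(prop)] <- 0

res <- data.frame(date = grid, obama = count, obama_rel = prop)
```

Here March 2010 contains no documents, so its proportion would be undefined without the zero replacement.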
Examples
## Not run:
data(politics)
poliClean <- cleanTexts(politics)
plotFreq(poliClean, wordlist=c("obama", "bush"))
## End(Not run)
[Package tosca version 0.3-2]