plotFreq {tosca}

R Documentation

Plotting Counts of specified Wordgroups over Time (relative to Corpus)

Description

Creates a plot of the counts or proportions of given wordgroups (`wordlist`) in the subcorpus. The counts or proportions can be calculated on document or word level, with the words of a group linked by 'and' or 'or'. They can additionally be normalised by the subcorpus, which is specified by `id`.

Usage

plotFreq(
  object,
  id = names(object$text),
  type = c("docs", "words"),
  wordlist,
  link = c("and", "or"),
  wnames,
  ignore.case = FALSE,
  rel = FALSE,
  mark = TRUE,
  unit = "month",
  curves = c("exact", "smooth", "both"),
  smooth = 0.05,
  both.lwd,
  both.lty,
  main,
  xlab,
  ylab,
  ylim,
  col,
  legend = "topright",
  natozero = TRUE,
  file,
  ...
)

Arguments

`object`	`textmeta` object with strictly tokenized `text` component (`character` vectors) - like a result of `cleanTexts`
`id`	`character` vector (default: `names(object$text)`) of IDs that specify the subcorpus
`type`	`character` (default: `"docs"`) should counts/proportions of documents (`"docs"`) or of words (`"words"`) be plotted
`wordlist`	list of `character` vectors. Every list element is an 'or' link, every `character` string in a vector is linked by the argument `link`. If `wordlist` is only a `character` vector, it will be coerced to a list of the same length as the vector (see `as.list`), so that the argument `link` has no effect. Each `character` vector as a list element represents one curve in the resulting plot
`link`	`character` (default: `"and"`) should the (inner) `character` vectors of each list element be linked by an `"and"` or an `"or"`
`wnames`	`character` vector of same length as `wordlist` - labels for every group of 'and' linked words
`ignore.case`	`logical` (default: `FALSE`) option from `grepl`.
`rel`	`logical` (default: `FALSE`) should counts (`FALSE`) or proportion (`TRUE`) be plotted
`mark`	`logical` (default: `TRUE`) should years be marked by vertical lines
`unit`	`character` (default: `"month"`) to which unit should dates be floored. Other possible units are `"bimonth"`, `"quarter"`, `"season"`, `"halfyear"`, `"year"`, for more units see `round_date`
`curves`	`character` (default: `"exact"`) should `"exact"`, `"smooth"` curve or `"both"` be plotted
`smooth`	`numeric` (default: `0.05`) smoothing parameter which is handed over to `lowess` as `f`
`both.lwd`	graphical parameter for smoothed values if `curves = "both"`
`both.lty`	graphical parameter for smoothed values if `curves = "both"`
`main`	`character` graphical parameter
`xlab`	`character` graphical parameter
`ylab`	`character` graphical parameter
`ylim`	(default if `rel = TRUE`: `c(0, 1)`) graphical parameter
`col`	graphical parameter, could be a vector. If `curves = "both"`, the function plots first the exact and then the smoothed curve for every wordgroup, which matters for the order of the colours in `col`.
`legend`	`character` (default: `"topright"`) value(s) to specify the legend coordinates. If `"none"`, no legend is plotted.
`natozero`	`logical` (default: `TRUE`) should NAs be coerced to zeros. Only has effect if `rel = TRUE`.
`file`	`character` file path if a pdf should be created
`...`	additional graphical parameters
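
How `wordlist` and `link` interact on document level can be pictured with a small base R sketch. It is an illustration under stated assumptions, not the package's implementation: the toy corpus `docs` and the helpers `has_word` and `matches_group` are hypothetical, and matching via `grepl` is only inferred from the description of `ignore.case`.

```r
# Hypothetical toy corpus: one character vector of tokens per document
docs <- list(
  a = c("obama", "meets", "bush"),
  b = c("obama", "speaks"),
  c = c("senate", "votes")
)

# Does any token of a document match the word? grepl treats `word` as a pattern.
has_word <- function(tokens, word, ignore.case = FALSE) {
  any(grepl(word, tokens, ignore.case = ignore.case))
}

# Does a document match a whole word group under the given link?
matches_group <- function(tokens, group, link = c("and", "or")) {
  link <- match.arg(link)
  hits <- vapply(group, function(w) has_word(tokens, w), logical(1))
  if (link == "and") all(hits) else any(hits)
}

group <- c("obama", "bush")
and_hits <- vapply(docs, matches_group, logical(1), group = group, link = "and")
or_hits  <- vapply(docs, matches_group, logical(1), group = group, link = "or")

sum(and_hits)  # documents containing both words
sum(or_hits)   # documents containing at least one of the words
```

In this toy corpus only document `a` contains both words, so the 'and' count is 1, while the 'or' count is 2.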

Value

A plot. Invisibly returns a data frame with the column `date` and one column per entry of `wnames`, plus columns `wnames_rel` if `rel = TRUE`, holding the counts (and proportions) of the given wordgroups.
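
The shape of such a data frame can be sketched in base R. This is a minimal illustration, not the function's code: the dates, the `matched` flags and the label `obama` are made-up values, the column names follow one reading of the description above, and flooring to months with `format` is a stand-in for `unit = "month"`.

```r
# Hypothetical toy data: one date and one match flag per document of the subcorpus
dates   <- as.Date(c("2009-01-05", "2009-01-20", "2009-02-03",
                     "2009-02-14", "2009-02-28"))
matched <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# Floor each date to the first day of its month
month <- format(dates, "%Y-%m-01")

# Matching documents and all documents per month
counts <- tapply(matched, month, sum)
totals <- tapply(matched, month, length)

# One row per month: absolute count and proportion relative to the subcorpus
res <- data.frame(
  date      = as.Date(names(counts)),
  obama     = as.integer(counts),
  obama_rel = as.numeric(counts / totals)
)
res
```

The `_rel` column divides each monthly count by the number of subcorpus documents in the same month, which is the normalisation that `rel = TRUE` refers to.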

Examples

## Not run: 
data(politics)
poliClean <- cleanTexts(politics)
plotFreq(poliClean, wordlist=c("obama", "bush"))

## End(Not run)
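
A slightly larger call might combine several of the arguments listed above. The sketch below is untested against the package and only uses arguments from the Usage section; the word groups and labels are hypothetical choices, and the call is skipped if `tosca` is not installed.

```r
# Hypothetical word groups: each list element becomes one curve
groups <- list(c("obama", "barack"), c("bush", "george"))
labels <- c("Obama", "Bush")

if (requireNamespace("tosca", quietly = TRUE)) {
  library(tosca)
  data(politics)
  poliClean <- cleanTexts(politics)
  # Quarterly proportions of documents matching any word of a group,
  # drawn as exact and smoothed curves
  res <- plotFreq(poliClean, wordlist = groups, link = "or", wnames = labels,
                  rel = TRUE, unit = "quarter", curves = "both", smooth = 0.1)
  # The invisibly returned data frame can be inspected afterwards
  head(res)
}
```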

[Package tosca version 0.3-2 Index]