R: Plotting Topics over Time relative to Corpus

plotHeat {tosca}

R Documentation

Plotting Topics over Time relative to Corpus

Description

Creates a pdf showing a heat map. For each topic, the heat map shows how far its share in each time period deviates from its mean share. Shares can be calculated at the corpus level (relative to document lengths) or at the subcorpus level (relative to the words from the LDA vocabulary). Deviations can be reported as absolute differences from the mean or relative to the topic's mean share, which accounts for different topic strengths.
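The calculation described above can be sketched in base R. This is a minimal illustration of one plausible reading of the description, not the package's implementation: the matrix `counts`, the vector `total_words`, and all values are hypothetical.

```r
# Hypothetical topic counts: rows are time periods, columns are topics.
counts <- matrix(
  c(10, 30,
    20, 30,
    30, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("2001", "2002", "2003"), c("T1", "T2"))
)

# Hypothetical number of words per period (the denominator of the share).
total_words <- c(100, 100, 100)

# Share of each topic in each period (row i is divided by total_words[i]).
shares <- counts / total_words

# Mean share of each topic across all periods.
mean_share <- colMeans(shares)

# Absolute deviation from the mean share (assumed analogue of norm = FALSE).
abs_dev <- sweep(shares, 2, mean_share, "-")

# Deviation relative to the mean share (assumed analogue of norm = TRUE).
rel_dev <- sweep(abs_dev, 2, mean_share, "/")
```

In this toy data, topic `T2` has a constant share and therefore zero deviation in every period, while `T1` rises from below to above its mean. Dividing by the mean puts a small and a large topic on a comparable scale.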

Usage

plotHeat(
  object,
  ldaresult,
  ldaID,
  select = 1:nrow(ldaresult$document_sums),
  tnames,
  norm = FALSE,
  file,
  unit = "year",
  date_breaks = 1,
  margins = c(5, 0),
  ...
)

Arguments

`object`	`textmeta` object with a strictly tokenized `text` component (shares are calculated relative to document lengths), or a `textmeta` object that contains only the `meta` component (shares are calculated relative to the number of words from the LDA vocabulary in each document).
`ldaresult`	LDA result object.
`ldaID`	Character vector containing IDs of the texts.
`select`	Numeric vector containing the numbers of the topics to be plotted. Defaults to all topics.
`tnames`	Character vector with labels for the topics.
`norm`	Logical: Should the values be normalized by the mean topic share to account for differently sized topics (default: `FALSE`)?
`file`	Character vector containing the path and name for the pdf output file.
`unit`	Character: To which unit should dates be floored (default: `"year"`)? Other possible units are `"bimonth"`, `"quarter"`, `"season"` and `"halfyear"`; for more units see `round_date`.
`date_breaks`	Interval at which labels are drawn on the x axis (default: `1`, every label is drawn). If `date_breaks` is `5`, every fifth label is drawn.
`margins`	See `heatmap`
`...`	Additional graphical parameters passed to `heatmap`, for example `distfun` or `hclustfun`.

Details

The function is useful to search for peaks in the coverage of topics.
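The effect of the `unit` and `date_breaks` arguments can be illustrated outside the package. The sketch below uses `floor_date` from lubridate (the package that provides `round_date`) and hypothetical dates. The label thinning is an assumption about how "every n-th label" might be realized, including the choice to start with the first label; it is not taken from the package source.

```r
library(lubridate)

# Hypothetical document dates.
dates <- as.Date(c("2001-02-14", "2001-05-03", "2001-11-30", "2002-01-09"))

# Floor each date to the start of its quarter, as unit = "quarter" would.
floored <- floor_date(dates, unit = "quarter")

# Thin the axis labels: keep every date_breaks-th label, blank the rest.
date_breaks <- 2
labels <- as.character(floored)
keep <- seq(1, length(labels), by = date_breaks)
labels[-keep] <- ""
```

With `date_breaks <- 2`, the first and third labels are kept and the others are blanked, which keeps a dense time axis readable.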

Value

A pdf file written to `file`. Invisibly, a data frame.

Examples

## Not run: 
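library(tosca)
data(politics)
# Clean and tokenize the texts
poliClean <- cleanTexts(politics)
# Keep only words that occur more than 10 times
words10 <- makeWordlist(text=poliClean$text)
words10 <- words10$words[words10$wordtable > 10]
# Prepare the documents and fit an LDA with 10 topics
poliLDA <- LDAprep(text=poliClean$text, vocab=words10)
LDAresult <- LDAgen(documents=poliLDA, K=10, vocab=words10)
# Draw the heat map for all topics
plotHeat(object=poliClean, ldaresult=LDAresult, ldaID=names(poliLDA))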

## End(Not run)

[Package tosca version 0.3-2 Index]