plot.STM {stm} | R Documentation |
Functions for plotting STM objects
Description
Produces one of four types of plots for an STM object. The default option,
"summary", prints topic words with their corpus frequency. "labels" is for
easy printing of tables of indicative words for each topic. "perspectives"
depicts differences between two topics, content covariates or combinations.
"hist" creates a histogram of the expected distribution of topic proportions
across the documents.
Usage
## S3 method for class 'STM'
plot(
x,
type = c("summary", "labels", "perspectives", "hist"),
n = NULL,
topics = NULL,
labeltype = c("prob", "frex", "lift", "score"),
frexw = 0.5,
main = NULL,
xlim = NULL,
ylim = NULL,
xlab = NULL,
family = "",
width = 80,
covarlevels = NULL,
plabels = NULL,
text.cex = 1,
custom.labels = NULL,
topic.names = NULL,
...
)
Arguments
x |
Model output from stm. |
type |
Sets the desired type of plot. See details for more information. |
n |
Sets the number of words used to label each topic. In perspectives
plots it approximately sets the total number of words in the plot. The
defaults are 3, 20 and 25 for "summary", "labels" and "perspectives" respectively. |
topics |
Vector of topics to display. For perspectives plots this must be a vector of length one or two. For the other types it defaults to all topics. |
labeltype |
Determines which option of "prob", "frex", "lift" or "score" is used to select the words that label each topic. |
frexw |
If "frex" labeltype is used, this will be the frex weight. |
main |
Title of the plot. |
xlim |
Range of the X-axis. |
ylim |
Range of the Y-axis. |
xlab |
Labels for the X-axis. For perspectives plots, use plabels instead. |
family |
The font family. Most of the time the user will not need to specify this, but it can be useful when working with other character sets. See par. |
width |
Sets the width, in number of characters, used for string wrapping
in type "labels". |
covarlevels |
A vector of length one or length two which contains the levels of the content covariate to be used in perspective plots. |
plabels |
This option can be used to override the default labels in the perspective plot that appear along the x-axis. It should be a character vector of length two which has the left hand side label first. |
text.cex |
Controls the scaling constant on text size. |
custom.labels |
A vector of custom labels if labeltype is equal to "custom". |
topic.names |
A vector of custom topic names. Defaults to "Topic #: ". |
... |
Additional parameters passed to plotting functions. |
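
The arguments above can be combined in a single call. The following is a minimal sketch rather than a recommended configuration: it assumes the stm package is installed and uses the gadarianFit model from the Examples section. The title, axis range, text scaling and the topic_names vector are arbitrary values chosen for illustration.

```r
library(stm)  # provides plot.STM and the gadarianFit example model

# One column of theta per topic, so ncol() gives the number of topics.
k <- ncol(gadarianFit$theta)

# Hypothetical short names, one per topic (illustration only).
topic_names <- paste0("T", seq_len(k), ":")

plot(
  gadarianFit,
  type = "summary",
  n = 4,                                   # four label words per topic
  main = "Gadarian data: topic prevalence",
  xlim = c(0, 1),                          # show the full proportion range
  text.cex = 0.9,                          # slightly smaller label text
  topic.names = topic_names
)
```

Building topic_names from the model itself keeps the vector the same length as the number of topics, which avoids mismatches if the sketch is reused with a different fit.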
Details
The function can produce four types of plots which summarize an STM object;
the type is chosen by the argument type. "summary" produces a plot which
displays the topics ordered by their expected frequency across the corpus.
"labels" plots the top words, selected according to the chosen criterion,
for each selected topic. "perspectives" plots two topics or topic-covariate
combinations. Words are sized in proportion to their use within the plotted
topic-covariate combinations and positioned along the X-axis based on how
much they favor one of the two configurations. If the words cluster on top
of each other, the user can either enlarge the plot or reduce the total
number of words on the plot. The vertical position of the words is random,
so rerunning the plot produces a different layout each time. Note that
"perspectives" plots do not use any of the labeling options directly.
"hist" plots a histogram of the MAP estimates of the document-topic
loadings across all documents. The median is also denoted by a dashed red
line.
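
As a rough illustration of the four plot types described above, the sketch below draws each one with a few non-default arguments. It assumes the stm package is installed and uses the gadarianFit model from the Examples section; the argument values are arbitrary. Calling set.seed() before the perspectives plot is an assumption that the random vertical placement draws from R's standard random number generator, in which case the layout should be repeatable.

```r
library(stm)  # provides plot.STM and the gadarianFit example model

# Summary: topics ordered by expected frequency, five FREX words per topic.
plot(gadarianFit, type = "summary", n = 5, labeltype = "frex")

# Labels: the top ten words for topics 1 and 3 only.
plot(gadarianFit, type = "labels", topics = c(1, 3), n = 10)

# Perspectives: contrast topics 1 and 2 using roughly 30 words in total.
set.seed(1)  # assumption: makes the random vertical placement repeatable
plot(gadarianFit, type = "perspectives", topics = c(1, 2), n = 30)

# Hist: distribution of document-topic proportions for topics 1 and 2.
plot(gadarianFit, type = "hist", topics = c(1, 2))
```

Reducing n in the perspectives call is the simplest way to act on the advice about overlapping words without changing the size of the graphics device.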
References
Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand. "Structural Topic Models for Open-Ended Survey Responses." American Journal of Political Science 58, no. 4 (2014): 1064-1082.
See Also
Examples
# Examples with the Gadarian data
plot(gadarianFit)
plot(gadarianFit, type = "labels")
plot(gadarianFit, type = "perspectives", topics = c(1, 2))
plot(gadarianFit, type = "hist")