R: Plot word keyness

textplot_keyness {quanteda.textplots}

R Documentation

Plot word keyness

Description

Plot the results of a "keyword" analysis of features, comparing their differential associations with a target and a reference group, after calculating keyness using quanteda.textstats::textstat_keyness().

Usage

textplot_keyness(
  x,
  show_reference = TRUE,
  show_legend = TRUE,
  n = 20L,
  min_count = 2L,
  margin = 0.05,
  color = c("darkblue", "gray"),
  labelcolor = "gray30",
  labelsize = 4,
  font = NULL
)

Arguments

`x`	a return object from `quanteda.textstats::textstat_keyness()`
`show_reference`	logical; if `TRUE`, show key reference features in addition to key target features
`show_legend`	logical; if `TRUE`, show legend
`n`	integer; number of features to plot
`min_count`	numeric; minimum total count of feature across the target and reference categories, for a feature to be included in the plot
`margin`	numeric; size of margin where feature labels are shown
`color`	character or integer; colors of bars for target and reference documents. `color` must have two elements when `show_reference = TRUE`. See ggplot2::color.
`labelcolor`	character; color of feature labels.
`labelsize`	numeric; size of feature labels and bars. See ggplot2::size.
`font`	character; font family of the text. The default font is used if `NULL`.
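
As a rough illustration of how `n` and `min_count` relate to the keyness table, the sketch below applies a comparable count filter by hand before plotting. The object names (`toks`, `dfmat`, `tstat`, `kept`, `p`) are illustrative, and the manual filter is an assumption drawn from the argument descriptions above, not a copy of the package's internal code.

```r
library("quanteda")

# Keyness table for Trump vs. other post-1980 presidents (chi^2 is the default measure)
toks <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = TRUE)
dfmat <- dfm(tokens_remove(toks, stopwords("en")))
dfmat <- dfm_group(dfmat, groups = dfmat$President)
tstat <- quanteda.textstats::textstat_keyness(dfmat, target = "Trump")

# Hand-rolled analogue of min_count = 5 (assumed behavior): keep features whose
# combined target + reference count is at least 5
kept <- tstat[tstat$n_target + tstat$n_reference >= 5, ]

# With the default sort = TRUE, the head holds the features most associated
# with the target and the tail those most associated with the reference
head(kept, 10)
tail(kept, 10)

# Let the plotting function apply the thresholds itself
p <- quanteda.textplots::textplot_keyness(tstat, n = 10L, min_count = 5L)
```

Raising `min_count` is one way to keep very rare features, whose keyness scores tend to be unstable, out of the plot.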

Value

a ggplot2 object
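
Because the return value is a ggplot2 object, it can in principle be modified with ordinary ggplot2 layers. The following is a minimal sketch under that assumption; the title text and the object names (`dfmat`, `tstat`, `p`, `p2`) are made up for illustration.

```r
library("quanteda")
library("ggplot2")

dfmat <- data_corpus_inaugural |>
    corpus_subset(Year > 1980) |>
    tokens(remove_punct = TRUE) |>
    tokens_remove(stopwords("en")) |>
    dfm()
dfmat <- dfm_group(dfmat, groups = dfmat$President)
tstat <- quanteda.textstats::textstat_keyness(dfmat, target = "Trump")

# Store the plot instead of printing it
p <- quanteda.textplots::textplot_keyness(tstat, n = 10, margin = 0.2)

# Add a standard ggplot2 layer with `+`; the title is an arbitrary example
p2 <- p + labs(title = "Keyness: Trump vs. other presidents since 1981")

# Printing the object draws the plot on the active graphics device
print(p2)
```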

Author(s)

Haiyan Wang and Kohei Watanabe

Examples

## Not run: 
library("quanteda")
# compare Trump speeches to other Presidents by chi^2
dfmat1 <- data_corpus_inaugural |>
     corpus_subset(Year > 1980) |>
     tokens(remove_punct = TRUE) |>
     tokens_remove(stopwords("en")) |>
     dfm()
dfmat1 <- dfm_group(dfmat1, groups = dfmat1$President)
tstat1 <- quanteda.textstats::textstat_keyness(dfmat1, target = "Trump")
textplot_keyness(tstat1, margin = 0.2, n = 10)

# compare contemporary Democrats v. Republicans
corp <- data_corpus_inaugural |>
    corpus_subset(Year > 1960)
corp$party <- ifelse(docvars(corp, "President") %in% c("Nixon", "Reagan", "Bush", "Trump"),
                     "Republican", "Democrat")
dfmat2 <- corp |>
    tokens(remove_punct = TRUE) |>
    tokens_remove(stopwords("en")) |>
    dfm()
tstat2 <- quanteda.textstats::textstat_keyness(dfm_group(dfmat2, groups = dfmat2$party),
                                               target = "Democrat", measure = "lr")
textplot_keyness(tstat2, color = c("blue", "red"), n = 10)

## End(Not run)

[Package quanteda.textplots version 0.94.4 Index]