ggram {ngramr} | R Documentation |
Plot n-gram frequencies
Description
ggram downloads data from the Google Ngram Viewer website and plots it in ggplot2 style.
Usage
ggram(
  phrases,
  ignore_case = FALSE,
  code_corpus = FALSE,
  geom = "line",
  geom_options = list(),
  lab = NA,
  google_theme = FALSE,
  ...
)
Arguments
phrases |
vector of phrases. Alternatively, phrases can be an ngram
object returned by ngram. |
ignore_case |
logical, indicating whether the frequencies are case
insensitive. Default is FALSE. |
code_corpus |
logical, indicating whether to use abbreviated corpus
codes or longer form descriptions. Default is FALSE. |
geom |
the ggplot2 geom used to plot the data; defaults to "line" |
geom_options |
list of additional parameters passed to the ggplot2 geom. |
lab |
y-axis label. Defaults to "Frequency". |
google_theme |
use a Google Ngram-style plot theme. |
... |
additional parameters passed to ngram |
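The arguments above can be combined in a single call. The following is a minimal sketch rather than part of the package's own examples: it assumes ngramr and ggplot2 are installed, and it reuses the hacker data object that the Examples section passes to ggram, so that no request to the Google Ngram Viewer is needed. The label text and the alpha value are arbitrary illustrative choices.

```r
# Sketch only: assumes the ngramr and ggplot2 packages are installed.
library(ngramr)
library(ggplot2)

# `hacker` is the ngram data object used in the Examples section;
# passing it instead of phrases avoids a download.
p <- ggram(
  hacker,
  geom = "line",                     # any ggplot2 geom name
  geom_options = list(alpha = 0.7),  # forwarded to the geom
  lab = "Relative frequency",        # illustrative y-axis label
  google_theme = FALSE
)

# The result can be extended with ordinary ggplot2 layers.
p + facet_wrap(~ Corpus) + theme(legend.position = "bottom")
```

Because the value returned by ggram can be extended with `+`, as the Examples section does, further ggplot2 layers, facets and theme settings can be added to it.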
Details
Google generated two datasets drawn from digitised books in the Google Books collection. One was generated in July 2009, the second in July 2012. Google will update these datasets as book scanning continues.
Examples
library(ngramr)
library(ggplot2)
ggram(c("hacker", "programmer"), year_start = 1950)
# Changing the geom.
ggram(c("cancer", "fumer", "cigarette"),
      year_start = 1900,
      corpus = "fr-2012",
      smoothing = 0,
      geom = "step")
# Passing more options.
ggram(c("cancer", "smoking", "tobacco"),
      year_start = 1900,
      corpus = "en-fiction-2012",
      geom = "point",
      smoothing = 0,
      geom_options = list(alpha = .5)) +
  stat_smooth(method = "loess", se = FALSE, formula = y ~ x)
# Setting the layers manually.
ggram(c("cancer", "smoking", "tobacco"),
      year_start = 1900,
      corpus = "en-fiction-2012",
      smoothing = 0,
      geom = NULL) +
  stat_smooth(method = "loess", se = FALSE, span = 0.3, formula = y ~ x)
# Setting the legend placement on a long query and using the Google theme.
# Example taken from a post by Ben Zimmer at Language Log.
p <- c("((The United States is + The United States has) / The United States)",
       "((The United States are + The United States have) / The United States)")
ggram(p, year_start = 1800, google_theme = TRUE) +
  theme(legend.direction = "vertical")
# Pass ngram data rather than phrases
# (hacker is a sample ngram dataset included with ngramr).
ggram(hacker) + facet_wrap(~ Corpus)