ci {RKorAPClient} | R Documentation |
Add confidence interval and relative frequency variables
Description
Using prop.test(), ci adds three columns to a data frame:
relative frequency (f)
lower bound of a confidence interval (conf.low)
upper bound of a confidence interval (conf.high)
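The following is a minimal offline sketch of the idea, not the package's implementation. It assumes a toy data frame with made-up counts in the default columns totalResults and total, and uses a hypothetical helper add_ci_sketch that calls prop.test() row by row.

```r
# Toy frequency table; the counts are invented for illustration only.
df <- data.frame(
  query        = c("A", "B"),
  totalResults = c(50, 200),   # observed absolute frequencies (x)
  total        = c(1e6, 2e6)   # total frequencies (N)
)

# Hypothetical helper (an assumption, not part of RKorAPClient):
# run prop.test() for each row and collect estimate and interval bounds.
add_ci_sketch <- function(df, conf.level = 0.95) {
  res <- mapply(function(x, n) {
    pt <- prop.test(x, n, conf.level = conf.level)
    c(f         = unname(pt$estimate),
      conf.low  = pt$conf.int[1],
      conf.high = pt$conf.int[2])
  }, df$totalResults, df$total)
  # mapply() returns a 3 x nrow matrix; transpose it into columns
  cbind(df, as.data.frame(t(res)))
}

out <- add_ci_sketch(df)
print(out)
```

Each row gets its own interval, so the bounds can be mapped directly to ymin and ymax in a plot.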
ipm is a convenience function for converting frequency tables to instances per million.
percent is a convenience function for converting frequency tables of alternative variants (generated with as.alternatives=TRUE) to percent.
queryStringToLabel converts a vector of query or vc strings to typically appropriate legend labels by clipping off prefixes and suffixes that are common to all query strings.
geom_freq_by_year_ci is an experimental convenience function for plotting typical frequency-by-year graphs with confidence intervals using ggplot2. Warning: This function may be moved to a new package.
Usage
ci(df, x = totalResults, N = total, conf.level = 0.95)
ipm(df)
percent(df)
queryStringToLabel(data, pubDateOnly = FALSE, excludePubDate = FALSE)
geom_freq_by_year_ci(mapping = aes(ymin = conf.low, ymax = conf.high), ...)
Arguments
df |
table returned from |
x |
column with the observed absolute frequency. |
N |
column with the total frequencies |
conf.level |
confidence level of the returned confidence interval. Must be a single number between 0 and 1. |
data |
string or vector of query or vc definition strings |
pubDateOnly |
discard all but the publication date |
excludePubDate |
discard publication date constraints |
mapping |
Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping. |
... |
Other arguments passed to geom_ribbon, geom_line, and geom_click_point. |
Details
Given a table with columns f, conf.low, and conf.high, ipm adds a column ipm and multiplies conf.low and conf.high by 10^6.
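The conversion described here can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the helper name ipm_sketch and the input values are hypothetical, and the new ipm column is assumed to be f scaled by 10^6.

```r
# Toy table of relative frequencies with interval bounds (invented values).
freq <- data.frame(
  f         = c(2e-6, 5e-6),
  conf.low  = c(1e-6, 4e-6),
  conf.high = c(3e-6, 6e-6)
)

# Hypothetical helper: add an ipm column and rescale the interval bounds.
ipm_sketch <- function(df) {
  df$ipm       <- df$f * 10^6          # assumption: ipm is f per million
  df$conf.low  <- df$conf.low * 10^6   # bounds end up on the same scale as ipm
  df$conf.high <- df$conf.high * 10^6
  df
}

res_ipm <- ipm_sketch(freq)
print(res_ipm)
```

Rescaling the bounds together with the new column keeps the interval on the same scale as the plotted value.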
Value
ipm: original table with additional column ipm and converted columns conf.low and conf.high
percent: original table with converted columns f, conf.low and conf.high
queryStringToLabel: string or vector of strings with clipped off common prefixes and suffixes
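To make the idea of clipping common prefixes and suffixes concrete, here is a character-level sketch. The helper clip_common_sketch is hypothetical. The real queryStringToLabel may apply different or additional rules, so its labels need not match this output.

```r
# Hypothetical helper: strip the longest prefix and suffix shared by all strings.
clip_common_sketch <- function(x) {
  if (length(x) < 2) return(x)
  chars   <- strsplit(x, "")
  min_len <- min(lengths(chars))

  # Length of the longest common prefix
  pre <- 0
  while (pre < min_len &&
         length(unique(vapply(chars, function(ch) ch[pre + 1],
                              character(1)))) == 1) {
    pre <- pre + 1
  }

  # Length of the longest common suffix that does not overlap the prefix
  suf <- 0
  while (suf < min_len - pre &&
         length(unique(vapply(chars, function(ch) ch[length(ch) - suf],
                              character(1)))) == 1) {
    suf <- suf + 1
  }

  substr(x, pre + 1, nchar(x) - suf)
}

labels <- clip_common_sketch(c("[marmot/m=mood:subj]", "[marmot/m=mood:ind]"))
print(labels)  # "subj" "ind"
```

A purely character-based approach can cut inside words when variants share leading or trailing letters, which is one reason a production implementation might clip more carefully.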
See Also
ci is already included in frequencyQuery()
Examples
## Not run:
library(ggplot2)
kco <- new("KorAPConnection", verbose=TRUE)
expand_grid(year=2015:2018, alternatives=c("Hate Speech", "Hatespeech")) %>%
  bind_cols(corpusQuery(kco, .$alternatives, sprintf("pubDate in %d", .$year))) %>%
  mutate(total=corpusStats(kco, vc=vc)$tokens) %>%
  ci() %>%
  ggplot(aes(x=year, y=f, fill=query, color=query, ymin=conf.low, ymax=conf.high)) +
    geom_point() + geom_line() + geom_ribbon(alpha=.3)
## End(Not run)
## Not run:
new("KorAPConnection") %>% frequencyQuery("Test", paste0("pubDate in ", 2000:2002)) %>% ipm()
## End(Not run)
## Not run:
new("KorAPConnection") %>%
  frequencyQuery(c("Tollpatsch", "Tolpatsch"),
                 vc=paste0("pubDate in ", 2000:2002),
                 as.alternatives = TRUE) %>%
  percent()
## End(Not run)
queryStringToLabel(paste("textType = /Zeit.*/ & pubDate in", c(2010:2019)))
queryStringToLabel(c("[marmot/m=mood:subj]", "[marmot/m=mood:ind]"))
queryStringToLabel(c("wegen dem [tt/p=NN]", "wegen des [tt/p=NN]"))
## Not run:
library(ggplot2)
kco <- new("KorAPConnection", verbose=TRUE)
expand_grid(condition = c("textDomain = /Wirtschaft.*/", "textDomain != /Wirtschaft.*/"),
            year = (2005:2011)) %>%
  cbind(frequencyQuery(kco, "[tt/l=Heuschrecke]",
                       paste0(.$condition, " & pubDate in ", .$year))) %>%
  ipm() %>%
  ggplot(aes(year, ipm, fill = condition, color = condition)) +
    geom_freq_by_year_ci()
## End(Not run)