R: Dictionary Use

dictionaryStatistics {DramaAnalysis}

R Documentation

Dictionary Use

Description

These methods count the number of occurrences of the words in the dictionaries, across different speakers and/or segments. The function dictionaryStatistics() calculates statistics for dictionaries with multiple entries, whereas dictionaryStatisticsSingle() does so for a single word list only.

The as.matrix() method extracts the numeric part of a QDDictionaryStatistics table as a matrix.
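
The counting idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the token table, the `Speaker` column, and the word field are invented for illustration, and only the column name `Token.lemma` is taken from this page.

```r
# Toy token table with invented data; `Speaker` is a hypothetical column name
tokens <- data.frame(
  Speaker     = c("A", "A", "B", "B", "B"),
  Token.lemma = c("Liebe", "Haus", "lieben", "liebe", "Krieg"),
  stringsAsFactors = FALSE
)

# Hypothetical single word list
wordfield <- c("Liebe", "lieben")

# Case-insensitive membership test (the idea behind ci = TRUE)
hit <- tolower(tokens$Token.lemma) %in% tolower(wordfield)

# Number of dictionary hits per speaker
counts <- tapply(hit, tokens$Speaker, sum)
counts
```

Here speaker A has one hit and speaker B has two, because "liebe" matches "Liebe" once case is ignored.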

Usage

dictionaryStatistics(
  drama,
  fields = DramaAnalysis::base_dictionary[fieldnames],
  fieldnames = c("Liebe"),
  segment = c("Drama", "Act", "Scene"),
  normalizeByCharacter = FALSE,
  normalizeByField = FALSE,
  byCharacter = TRUE,
  column = "Token.lemma",
  ci = TRUE
)

dictionaryStatisticsSingle(
  drama,
  wordfield = c(),
  segment = c("Drama", "Act", "Scene"),
  normalizeByCharacter = FALSE,
  normalizeByField = FALSE,
  byCharacter = TRUE,
  fieldNormalizer = length(wordfield),
  column = "Token.lemma",
  ci = TRUE,
  colnames = NULL
)

## S3 method for class 'QDDictionaryStatistics'
as.matrix(x, ...)

Arguments

`drama`	A QDDrama object.
`fields`	A list of lists that contains the actual field names. By default, we load the `base_dictionary`.
`fieldnames`	A list of names for the dictionaries.
`segment`	The segment level that should be used. By default, the entire play will be used. Possible values are "Drama" (default), "Act" or "Scene".
`normalizeByCharacter`	Logical. Whether to normalize by character speech length.
`normalizeByField`	Logical. Whether to normalize by dictionary size. You usually want this.
`byCharacter`	Logical, defaults to TRUE. If FALSE, values will be calculated for the entire segment (play, act, or scene), and not for individual characters.
`column`	The table column the dictionary is applied to. Should be either "Token.surface" or "Token.lemma"; the latter is the default.
`ci`	Whether to ignore case. Defaults to TRUE, i.e., case is ignored.
`wordfield`	A character vector containing the words or lemmas to be counted (only for `*Single` functions).
`fieldNormalizer`	Defaults to the length of the wordfield. If normalizeByField is TRUE, the absolute numbers are divided by this number.
`colnames`	The column names to be used in the output table.
`x`	An object of the type `QDDictionaryStatistics`, e.g., the output of `dictionaryStatistics`.
`...`	All other parameters are passed to `as.matrix.data.frame()`.
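
To make the two normalization arguments concrete, here is a small arithmetic sketch in base R. It only illustrates the divisions described above and is not the package's code; the counts and speech lengths are invented, and measuring speech length in tokens is an assumption.

```r
# Invented raw counts of dictionary hits per character (not real data)
raw <- c(A = 6, B = 3)

# Assumed speech lengths in tokens, and a hypothetical word field of 3 entries
speech_length <- c(A = 200, B = 50)
wordfield <- c("Liebe", "lieben", "Herz")
fieldNormalizer <- length(wordfield)

# normalizeByField = TRUE: divide by the size of the word field
by_field <- raw / fieldNormalizer

# normalizeByCharacter = TRUE: divide by how much each character speaks
by_character <- raw / speech_length

# Both normalizations combined
both <- raw / fieldNormalizer / speech_length
```

With these numbers, `by_field` is 2 and 1, `by_character` is 0.03 and 0.06, and `both` is 0.01 and 0.02. Normalizing by field makes dictionaries of different sizes comparable, which is presumably why the page says you usually want it.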

Value

A numeric matrix that contains the frequency with which a dictionary is present in a subset of tokens.

Examples

# Check multiple dictionary entries
data(rksp.0)
dstat <- dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie"))
# Check a single dictionary entry
data(rksp.0)
fstat <- dictionaryStatisticsSingle(rksp.0, wordfield=c("der"))
# Extract the numeric part of the statistics table as a matrix
mat <- as.matrix(dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie")))
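
A further variation, sketched under the assumption that the DramaAnalysis package and its `rksp.0` data set are installed: it combines the `segment` and `normalizeByField` arguments from the usage section. The variable names `dstat_act` and `mat_act` are chosen here for illustration only.

```r
library(DramaAnalysis)
data(rksp.0)

# Per-act statistics for two fields of the base dictionary,
# divided by the size of each word field
dstat_act <- dictionaryStatistics(
  rksp.0,
  fieldnames = c("Krieg", "Familie"),
  segment = "Act",
  normalizeByField = TRUE
)

# Keep only the numeric part of the result
mat_act <- as.matrix(dstat_act)
```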