tardis_multidict {tardis}    R Documentation

Analyze text with more than one dictionary

Description

This convenience function takes input text and a set of dictionaries, and calls tardis::tardis() once for each dictionary. Any other parameters are passed along to tardis::tardis().

Usage

tardis_multidict(input_text, text_column = NA, dictionaries, ...)

Arguments

input_text

A text to be analyzed, either a tbl_df or a character vector.

text_column

If input_text is a tbl_df, a character string giving the name of the column that contains the text to be analyzed.

dictionaries

A single tbl_df with columns dictionary, token, and (optionally, for weighted dictionaries) score.

...

Other parameters passed on to tardis::tardis().

Details

Dictionaries must be in a single tbl_df with at least two columns: token, containing the tokens belonging to each dictionary; and dictionary, containing an identifier that maps each token to a dictionary. Weights, if present, must be in a column named score.

Tokens can be mapped to multiple dictionaries, but each row maps one token to one dictionary.
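
As a minimal sketch of the format described above, the following builds a dictionaries table from a named list of word lists. The list word_lists, the dictionary names, and the tokens are hypothetical and used only for illustration; the weight of 1 is likewise an assumption, chosen to mirror the score column used in the Examples section.

```r
# Hypothetical word lists: names become dictionary identifiers (illustrative only)
word_lists <- list(
  animals = c("cat", "wolf"),
  pets    = c("cat", "dog")
)

# One row per (dictionary, token) pair, in the long format described above
dictionaries <- dplyr::tibble(
  dictionary = rep(names(word_lists), lengths(word_lists)),
  token      = unlist(word_lists, use.names = FALSE)
)

# Optional weights must live in a column named score (assumed weight: 1)
dictionaries$score <- 1

# Check the required columns are present
stopifnot(all(c("dictionary", "token") %in% names(dictionaries)))

# "cat" maps to two dictionaries, with one row per mapping
dictionaries[dictionaries$token == "cat", ]
```

A table built this way can be supplied as the dictionaries argument of tardis_multidict().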

Value

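A tbl_df with new columns for each dictionary. In the example below, the dictionaries anger and sadness yield the columns score_anger and score_sadness.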

Examples

## Not run: 
library(magrittr)
# Get NRC emotions dataset from textdata package
nrc_emotion <- textdata::lexicon_nrc() %>%
  dplyr::rename(token = word, dictionary = sentiment) %>%
  dplyr::mutate(score = 1)

# set up some input text
text <- dplyr::tibble(body = c("I am so angry!", "I am angry.",
  "I'm not angry.", "Your mother and I aren't angry, we're just disappointed."))

emotions <- tardis_multidict(input_text = text, text_column = "body",
  dictionaries = nrc_emotion) %>%
  dplyr::select(body, score_anger, score_sadness)

emotions

## End(Not run)

[Package tardis version 0.1.4 Index]