newswire_segmenter_folder {disclosuR}    R Documentation

Newswire segmenter (multiple files)

Description

Reads all 'newswire' PDF documents obtained from 'NexisUni' that are stored in a folder and transforms them into an R data frame with one row per document.

Usage

newswire_segmenter_folder(
  folder_path,
  sentiment = FALSE,
  emotion = FALSE,
  regulatory_focus = FALSE,
  laughter = FALSE,
  narcissism = FALSE,
  text_clustering = FALSE
)

Arguments

folder_path

The path to the folder in which the 'newswire' PDFs reside. If the path is not absolute, it is interpreted relative to the current working directory, getwd().

sentiment

Performs dictionary-based sentiment analysis using the analyzeSentiment function (default: FALSE)

emotion

Performs dictionary-based emotion analysis using the get_nrc_sentiment function (default: FALSE)

regulatory_focus

Calculates the number of words indicative of promotion and prevention focus, based on the dictionary developed by Gamache et al. (2015) (default: FALSE)

laughter

Counts the number of times laughter was indicated in a quote. (default: FALSE)

narcissism

Counts pronoun usage and calculates the ratio of first-person singular to first-person plural pronouns. This measure is derived from Zhu & Chen (2015) (default: FALSE)

text_clustering

Categorizes each document using a dictionary based on the framework of Graffin et al. (2016) (default: FALSE)
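
To illustrate the kind of measure the narcissism argument describes, the following base R sketch counts first-person singular and first-person plural pronouns and takes their ratio. The helper name pronoun_ratio and both pronoun lists are assumptions made for illustration; the dictionary and tokenization used by the package may differ.

```r
# Hypothetical helper: ratio of first-person singular to first-person plural
# pronouns in a character string. The pronoun lists are illustrative
# assumptions, not the package's actual dictionary.
pronoun_ratio <- function(text) {
  singular <- c("i", "me", "my", "mine", "myself")
  plural   <- c("we", "us", "our", "ours", "ourselves")

  # Lower-case the text and split on anything that is not a letter or apostrophe
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]

  n_singular <- sum(tokens %in% singular)
  n_plural   <- sum(tokens %in% plural)

  list(
    singular = n_singular,
    plural   = n_plural,
    # Return NA when no plural pronouns occur, to avoid dividing by zero
    ratio    = if (n_plural > 0) n_singular / n_plural else NA_real_
  )
}

res <- pronoun_ratio(
  "I believe my team and I delivered what we promised our investors."
)
```

Returning NA when no plural pronoun is found is a design choice of this sketch; how the package handles that case is not stated here.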

Value

An R data frame with each row representing one 'newswire' article. The columns indicate the title, text, 'newswire', date, and weekday. Depending on the additional arguments, the output can also contain sentiment, emotion, regulatory focus, laughter, narcissism, and a text cluster based on the Graffin et al. (2016) categories.
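
The shape of such a result can be sketched with a small mock data frame. The column names (title, text, newswire, date, weekday) are assumed from the description above and may not match the names the function actually returns; the rows are made-up placeholders, not package output.

```r
# Mock data standing in for the function's output. Column names and values
# are assumptions for illustration only.
mock_newswire_df <- data.frame(
  title    = c("Company A appoints new CEO",
               "Company B reports quarterly results",
               "Company A opens new plant"),
  text     = c("Placeholder text of article one.",
               "Placeholder text of article two.",
               "Placeholder text of article three."),
  newswire = c("PR Newswire", "Business Wire", "PR Newswire"),
  date     = as.Date(c("2023-03-01", "2023-03-02", "2023-03-06")),
  stringsAsFactors = FALSE
)

# Derive the weekday from the date (the label depends on the session locale)
mock_newswire_df$weekday <- weekdays(mock_newswire_df$date)

# One row per article
n_articles <- nrow(mock_newswire_df)

# Number of articles per newswire
articles_per_wire <- table(mock_newswire_df$newswire)
```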

Examples

# Segment all newswire PDFs in the example folder shipped with the package
newswire_df <- newswire_segmenter_folder(
  folder_path = system.file("inst", "examples", "newswire",
                            package = "disclosuR")
)

# Same call with dictionary-based sentiment analysis enabled
newswire_df_sentiment <- newswire_segmenter_folder(
  folder_path = system.file("inst", "examples", "newswire",
                            package = "disclosuR"),
  sentiment = TRUE
)
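
Before pointing folder_path at your own data, it can help to confirm that the folder resolves as expected and actually contains PDF files. The sketch below uses base R only; the folder and file names are hypothetical, and it does not reflect how the package itself lists files.

```r
# Create a temporary folder with empty placeholder files (for listing only)
demo_folder <- file.path(tempdir(), "newswire_demo")
dir.create(demo_folder, showWarnings = FALSE)
invisible(file.create(
  file.path(demo_folder, c("article_1.pdf", "article_2.PDF", "notes.txt"))
))

# Resolve to an absolute path; a relative path is resolved against getwd()
abs_folder <- normalizePath(demo_folder, mustWork = TRUE)

# List PDF files only, ignoring case in the file extension
pdf_files <- list.files(
  abs_folder,
  pattern     = "\\.pdf$",
  ignore.case = TRUE,
  full.names  = TRUE
)

# Fail early with a clear message if the folder holds no PDFs
if (length(pdf_files) == 0) {
  stop("No PDF files found in: ", abs_folder)
}
```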


[Package disclosuR version 0.6.0 Index]