R: Newswire segmenter

newswire_segmenter {disclosuR}

R Documentation

Newswire segmenter

Description

Takes a PDF file containing 'newswire' articles obtained from 'NexisUni' and transforms it into an R data frame with one row per article.

Usage

newswire_segmenter(
  file,
  sentiment = FALSE,
  emotion = FALSE,
  regulatory_focus = FALSE,
  laughter = FALSE,
  narcissism = FALSE,
  text_clustering = FALSE
)

Arguments

`file`	The name of the PDF file which the data are to be read from. If it does not contain an absolute path, the file name is relative to the current working directory, getwd().
`sentiment`	Performs dictionary-based sentiment analysis based on the `analyzeSentiment` function (default: FALSE)
`emotion`	Performs dictionary-based emotion analysis based on the `get_nrc_sentiment` function (default: FALSE)
`regulatory_focus`	Calculates the number of words indicative of promotion and prevention focus, based on the dictionary developed by Gamache et al. (2015) (default: FALSE)
`laughter`	Counts the number of times laughter was indicated in a quote. (default: FALSE)
`narcissism`	Counts pronoun usage and calculates the ratio of first-person singular to first-person plural pronouns. This measure is derived from Zhu & Chen (2015). (default: FALSE)
`text_clustering`	Categorizes each document using a dictionary based on the framework developed by Graffin et al. (2016). (default: FALSE)
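
To make the `narcissism` measure concrete, the following base R sketch counts first-person singular and plural pronouns in a text and returns their ratio. It is only an illustration of the idea: the pronoun lists and the helper name `pronoun_ratio` are assumptions made here, and the dictionary and tokenization used inside the package may differ.

```r
# Illustrative pronoun lists (assumed); the package's own dictionary may differ.
first_singular <- c("i", "me", "my", "mine", "myself")
first_plural <- c("we", "us", "our", "ours", "ourselves")

# Hypothetical helper: counts pronouns and returns the singular/plural ratio.
pronoun_ratio <- function(text) {
  # Lowercase, then split on anything that is not a letter or an apostrophe
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]
  n_singular <- sum(tokens %in% first_singular)
  n_plural <- sum(tokens %in% first_plural)
  # Return NA when no plural pronouns occur, to avoid dividing by zero
  ratio <- if (n_plural > 0) n_singular / n_plural else NA_real_
  list(singular = n_singular, plural = n_plural, ratio = ratio)
}

res <- pronoun_ratio("I think we should report our results, and I stand by my team.")
print(res)
```

Returning `NA` for texts without plural pronouns is a design choice of this sketch; how the package handles that case is not stated in this documentation.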

Value

An R data frame with each row representing one 'newswire' article. The columns indicate the title, text, 'newswire', date, and weekday.

Examples

newswire_df <- newswire_segmenter(
  file = system.file("inst",
                     "examples",
                     "newswire", "newswire_example_01.pdf",
                     package = "disclosuR"))
newswire_df_sentiment <- newswire_segmenter(
  file = system.file("inst",
                     "examples",
                     "newswire", "newswire_example_01.pdf",
                     package = "disclosuR"),
  sentiment = TRUE)
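
Because `system.file()` returns an empty string when a file or package cannot be found, it can help to check the path before passing it on. The sketch below reuses the path components from the examples above; the variable name `example_path` and the wording of the message are illustrative choices, not part of the package.

```r
# Path components copied from the examples above; system.file() returns ""
# when the package or the file cannot be found.
example_path <- system.file("inst", "examples", "newswire",
                            "newswire_example_01.pdf",
                            package = "disclosuR")

if (nzchar(example_path)) {
  # Only call the segmenter when the example PDF was actually located
  newswire_df <- disclosuR::newswire_segmenter(file = example_path)
  str(newswire_df)
} else {
  message("Example PDF not found; check that 'disclosuR' is installed.")
}
```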

[Package disclosuR version 0.6.0 Index]