newswire_segmenter {disclosuR} | R Documentation |
Newswire segmenter
Description
Takes a PDF file containing 'newswire' articles obtained from 'NexisUni' and transforms it into an R data frame with one row per article.
Usage
newswire_segmenter(
file,
sentiment = FALSE,
emotion = FALSE,
regulatory_focus = FALSE,
laughter = FALSE,
narcissism = FALSE,
text_clustering = FALSE
)
Arguments
file |
The name of the PDF file which the data are to be read from. If it does not contain an absolute path, the file name is relative to the current working directory, getwd(). |
sentiment |
Performs dictionary-based sentiment analysis
based on the |
emotion |
Performs dictionary-based emotion analysis based on the
|
regulatory_focus |
Calculates the number of words indicative of promotion and prevention focus, based on the dictionary developed by Gamache et al. (2015). (default: FALSE) |
laughter |
Counts the number of times laughter was indicated in a quote. (default: FALSE) |
narcissism |
Counts pronoun usage and calculates the ratio of first-person singular to first-person plural pronouns. This measure is derived from Zhu & Chen (2015). (default: FALSE) |
text_clustering |
Categorizes documents using a dictionary based on the framework developed by Graffin et al. (2016). (default: FALSE) |
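The pronoun ratio described for the narcissism argument can be illustrated with a minimal base R sketch. The helper name pronoun_ratio, the pronoun lists, and the tokenization are assumptions made for illustration only; the dictionary and text processing used by the package itself may differ.

```r
# Hypothetical helper, not part of disclosuR: counts first-person singular
# and plural pronouns and returns their ratio.
pronoun_ratio <- function(text) {
  # Assumed pronoun lists; the package's own dictionary may differ.
  singular <- c("i", "me", "my", "mine", "myself")
  plural <- c("we", "us", "our", "ours", "ourselves")

  # Lower-case the text and split on anything that is not a letter or apostrophe.
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]

  n_singular <- sum(tokens %in% singular)
  n_plural <- sum(tokens %in% plural)

  # Return NA when no plural pronouns occur, to avoid dividing by zero.
  ratio <- if (n_plural > 0) n_singular / n_plural else NA_real_
  list(singular = n_singular, plural = n_plural, ratio = ratio)
}

res <- pronoun_ratio(
  "I believe my team delivered. We thank our investors, and I am proud."
)
res$ratio  # 3 singular / 2 plural = 1.5
```

Returning NA rather than Inf when no plural pronouns are present is a design choice of this sketch, not documented behaviour of the package.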
Value
An R data frame with each row representing one 'newswire' article. The columns indicate the title, text, 'newswire', date, and weekday.
Examples
newswire_df <- newswire_segmenter(
  file = system.file("inst",
                     "examples",
                     "newswire", "newswire_example_01.pdf",
                     package = "disclosuR"))
# sentiment is an argument of newswire_segmenter(), not of system.file()
newswire_df_sentiment <- newswire_segmenter(
  file = system.file("inst",
                     "examples",
                     "newswire", "newswire_example_01.pdf",
                     package = "disclosuR"),
  sentiment = TRUE)
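The call pattern in the examples above can be made slightly more defensive. The following sketch assumes the bundled example PDF sits at the same path used in those examples; it checks that the file was found before segmenting it and enables two analyses in one call. The variable names are illustrative.

```r
# Locate the example PDF; system.file() returns "" if it cannot be found.
pdf_path <- system.file("inst", "examples", "newswire",
                        "newswire_example_01.pdf",
                        package = "disclosuR")

if (nzchar(pdf_path)) {
  # Analysis flags are passed to newswire_segmenter() itself.
  newswire_df_checked <- disclosuR::newswire_segmenter(
    file = pdf_path,
    sentiment = TRUE,
    emotion = TRUE
  )
  str(newswire_df_checked)  # inspect the returned data frame
} else {
  message("Example PDF not found; check that 'disclosuR' is installed.")
}
```

Resolving the path first and testing it with nzchar() avoids passing an empty file name to the segmenter when the package or the example file is missing.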