transformer_scores {transforEmotion}	R Documentation
Sentiment Analysis Scores
Description
Uses sentiment analysis pipelines from huggingface to compute probabilities that the text corresponds to the specified classes.
Usage
transformer_scores(
  text,
  classes,
  multiple_classes = FALSE,
  transformer = c("cross-encoder-roberta", "cross-encoder-distilroberta",
                  "facebook-bart"),
  preprocess = FALSE,
  keep_in_env = TRUE,
  envir = 1
)
Arguments
text: Character vector or list. Text in a vector or list data format.
classes: Character vector. Classes to score the text.
multiple_classes: Boolean. Whether the text can belong to multiple true classes. If TRUE, each class is scored independently of the others rather than the probabilities summing to 1 across classes. Defaults to FALSE (see the final example under Examples).
transformer: Character. Specific zero-shot sentiment analysis transformer to be used. Default options: "cross-encoder-roberta", "cross-encoder-distilroberta", and "facebook-bart". Defaults to "cross-encoder-distilroberta". Also allows any zero-shot classification model with a pipeline from huggingface to be used by passing its model name (e.g., "typeform/distilbert-base-uncased-mnli").
preprocess: Boolean. Should basic preprocessing be applied? Includes making lowercase, keeping only alphanumeric characters, removing escape characters, removing repeated characters, and removing white space. Defaults to FALSE. A rough base-R sketch of these steps follows this argument list.
keep_in_env: Boolean. Whether the classifier should be kept in your global environment, so it does not need to be re-loaded on repeated calls. Defaults to TRUE.
envir: Numeric. Environment for the classifier to be saved for repeated use. Defaults to the global environment.
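The preprocessing itself is internal to transforEmotion; as a rough guide, a minimal base-R sketch of the steps named above might look like the following (the function name, step order, and regular expressions are illustrative assumptions, not the package's own code):

# Hypothetical re-creation of the preprocess = TRUE steps
basic_preprocess <- function(text) {
  text <- tolower(text)                    # make lowercase
  text <- gsub("[\r\n\t]", " ", text)      # remove escape characters
  text <- gsub("[^a-z0-9 ]", "", text)     # keep only alphanumeric characters
  text <- gsub("(.)\\1{2,}", "\\1", text)  # collapse characters repeated 3+ times
  text <- gsub("\\s+", " ", trimws(text))  # remove extra white space
  text
}

basic_preprocess("Sooo   HAPPY!!!")  # returns "so happy"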
Value
Returns probabilities for the text classes.
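The exact container for these probabilities can vary by package version; a quick way to inspect it (a sketch, slow on first use because a model is downloaded):

# Sketch: inspect the structure of the returned object
scores <- transformer_scores(
  text = "I love talking to people",
  classes = c("friendly", "assertive")
)
str(scores)  # class probabilities for each input text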
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
# BART
Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., ... & Zettlemoyer, L. (2019).
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension.
arXiv preprint arXiv:1910.13461.
# RoBERTa
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... & Stoyanov, V. (2019).
RoBERTa: A robustly optimized BERT pretraining approach.
arXiv preprint arXiv:1907.11692.
# Zero-shot classification
Yin, W., Hay, J., & Roth, D. (2019).
Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach.
arXiv preprint arXiv:1909.00161.
# MultiNLI dataset
Williams, A., Nangia, N., & Bowman, S. R. (2017).
A broad-coverage challenge corpus for sentence understanding through inference.
arXiv preprint arXiv:1704.05426.
Examples
# Load data
data(neo_ipip_extraversion)
# Example text
text <- neo_ipip_extraversion$friendliness[1:5]
## Not run:
# Cross-Encoder DistilRoBERTa (default transformer)
transformer_scores(
text = text,
classes = c(
"friendly", "gregarious", "assertive",
"active", "excitement", "cheerful"
)
)
# Facebook BART Large
transformer_scores(
text = text,
classes = c(
"friendly", "gregarious", "assertive",
"active", "excitement", "cheerful"
),
transformer = "facebook-bart"
)
# Directly from huggingface: typeform/distilbert-base-uncased-mnli
transformer_scores(
text = text,
classes = c(
"friendly", "gregarious", "assertive",
"active", "excitement", "cheerful"
),
transformer = "typeform/distilbert-base-uncased-mnli"
)
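# Multiple true classes: a sketch using the documented multiple_classes
# argument (assumption: with multiple_classes = TRUE, classes are scored
# independently rather than probabilities summing to 1)
transformer_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  multiple_classes = TRUE
)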
## End(Not run)