nlp_scores {transforEmotion} | R Documentation |
Natural Language Processing Scores
Description
Natural Language Processing using word embeddings to compute semantic similarities (cosine; see costring) of text and specified classes.
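As a rough illustration of the cosine measure mentioned above, the following base R sketch computes the cosine similarity between small numeric vectors. The function name cosine_similarity and the three-dimensional vectors are hypothetical and made up for illustration; they are not taken from this package or from any of its semantic spaces.

```r
# Cosine similarity: dot product divided by the product of the vector norms.
# `cosine_similarity` is a hypothetical helper, not part of transforEmotion.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Toy "embeddings" with made-up values (real semantic spaces have
# hundreds of dimensions)
friendly <- c(0.8, 0.1, 0.3)
cheerful <- c(0.7, 0.2, 0.4)
angry    <- c(-0.6, 0.9, 0.1)

# Vectors pointing in similar directions score closer to 1
cosine_similarity(friendly, cheerful)
cosine_similarity(friendly, angry)
```

In this toy setup, the first pair scores higher than the second because the two vectors point in nearly the same direction.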
Usage
nlp_scores(
text,
classes,
semantic_space = c("baroni", "cbow", "cbow_ukwac", "en100", "glove", "tasa"),
preprocess = TRUE,
remove_stop = TRUE,
keep_in_env = TRUE,
envir = 1
)
Arguments
text |
Character vector or list. Text in a vector or list data format |
classes |
Character vector. Classes to score the text |
semantic_space |
Character vector. The semantic space used to compute the distances between words (more than one allowed). The available semantic spaces, as listed in Usage, are "baroni", "cbow", "cbow_ukwac", "en100", "glove", and "tasa" |
preprocess |
Boolean. Should basic preprocessing be applied? Includes making lowercase, keeping only alphanumeric characters, removing escape characters, removing repeated characters, and removing white space. Defaults to TRUE |
remove_stop |
Boolean. Should stop words be removed? Defaults to TRUE |
keep_in_env |
Boolean. Whether the classifier should be kept in your global environment. Defaults to TRUE |
envir |
Numeric. Environment for the classifier to be saved for repeated use. Defaults to the global environment |
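The preprocessing steps listed under preprocess can be approximated in base R as follows. This is only a sketch under stated assumptions: basic_preprocess is a hypothetical name, the order of the steps is a guess, and "removing repeated characters" is interpreted here as collapsing runs of three or more identical characters. The package's internal implementation may differ.

```r
# Hypothetical approximation of the documented preprocessing steps;
# not the package's own code.
basic_preprocess <- function(x) {
  x <- tolower(x)                                   # make lowercase
  x <- gsub("[\r\n\t]", " ", x)                     # escape characters -> space
  x <- gsub("[^a-z0-9 ]", "", x)                    # keep alphanumerics and spaces
  x <- gsub("(.)\\1{2,}", "\\1", x, perl = TRUE)    # assumed: collapse 3+ repeats
  x <- gsub(" +", " ", x)                           # squeeze repeated spaces
  trimws(x)                                         # drop leading/trailing space
}

basic_preprocess("I  LOVE\tmeeting\nnew people!!!")
basic_preprocess("Sooooo   happy!!!")
```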
Value
Returns semantic distances between the text and the specified classes
Author(s)
Alexander P. Christensen <alexpaulchristensen@gmail.com>
References
Baroni, M., Dinu, G., & Kruszewski, G. (2014). Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd annual meeting of the association for computational linguistics (pp. 238-247).
Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211-240.
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (pp. 1532-1543).
Examples
# Load data
data(neo_ipip_extraversion)
# Example text
text <- neo_ipip_extraversion$friendliness[1:5]
## Not run:
# GloVe
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  )
)
# Baroni
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  semantic_space = "baroni"
)
# CBOW
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  semantic_space = "cbow"
)
# CBOW + ukWaC
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  semantic_space = "cbow_ukwac"
)
# en100
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  semantic_space = "en100"
)
# tasa
nlp_scores(
  text = text,
  classes = c(
    "friendly", "gregarious", "assertive",
    "active", "excitement", "cheerful"
  ),
  semantic_space = "tasa"
)
## End(Not run)