sentiment_score {sentiment.ai}    R Documentation
Simple Sentiment Scores
Description
This uses a simple model (xgboost or glm) on text embeddings to return a predictive score, where values closer to 1 are more positive and values closer to -1 are more negative. The score can be used to determine whether the sentiment of a text is positive or negative.
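For orientation, a minimal call looks like the sketch below (it assumes the Python environment has already been set up with install_sentiment.ai; the input strings are made up for illustration):

# scores near 1 read as positive, scores near -1 as negative
sentiment_score(c("The crew was wonderful!", "Worst flight of my life."))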
Usage
sentiment_score(
x = NULL,
model = names(default_models),
scoring = c("xgb", "glm"),
scoring_version = "1.0",
batch_size = 100,
...
)
Arguments
x
A plain text vector or a column name if data is supplied. If you know what you are doing, you can also pass in a 512-dimensional numeric embedding.

model
An embedding model name from TensorFlow Hub. Options include "en" (English, standard or large) and "multi" (multilingual, standard or large), e.g. "en.large" as in the Examples; see names(default_models) for the full list.

scoring
Model to use for scoring the embedding matrix (currently either "xgb" or "glm").

scoring_version
The scoring version to use. Currently only "1.0", but other versions may be supported in the future.

batch_size
Size of the batches to process at a time. Larger batches are faster than smaller ones, but take care not to exhaust your system memory. A rough sketch of the batching idea follows this list.

...
Additional arguments passed on to downstream calls (for example, envname, as used in the Examples).
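Conceptually, batching just means the input is embedded and scored in chunks of at most batch_size elements rather than all at once. The lines below are a rough sketch of that idea, not the package's exact internals:

# hypothetical input vector 'x' split into chunks of batch_size texts
batches <- split(x, ceiling(seq_along(x) / batch_size))
# each chunk is then embedded and scored, and the results are recombined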
Details
Uses simple predictive models on embeddings to provide the probability of a positive score (rescaled to -1:1 for consistency with other sentiment packages).
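As a worked illustration of the rescaling (the 2 * p - 1 mapping is an assumption; the documentation only states the -1 to 1 range), a positive-class probability p would map to a score as follows:

p <- 0.90              # probability that the text is positive
score <- 2 * p - 1     # maps [0, 1] onto [-1, 1]; here score = 0.8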
Value
A numeric vector of length(x) containing the re-scaled sentiment probabilities.
Examples
## Not run:
envname <- "r-sentiment-ai"
# make sure to install sentiment.ai first (install_sentiment.ai)
# install_sentiment.ai(envname = envname,
#                      method = "conda")
# running the model
mod_xgb <- sentiment_score(x = airline_tweets$text,
                           model = "en.large",
                           scoring = "xgb",
                           envname = envname)
mod_glm <- sentiment_score(x = airline_tweets$text,
                           model = "en.large",
                           scoring = "glm",
                           envname = envname)
# checking performance
# recode labels as 0 (negative), 0.5 (neutral), 1 (positive)
pos_neg <- factor(airline_tweets$airline_sentiment,
                  levels = c("negative", "neutral", "positive"))
pos_neg <- (as.numeric(pos_neg) - 1) / 2
cosine(mod_xgb, pos_neg)
cosine(mod_glm, pos_neg)
# you could also calculate accuracy/kappa
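# a rough sketch of an accuracy check (assumes scores above 0 count as
# positive and ignores neutral tweets; the threshold is an illustrative choice)
keep <- pos_neg != 0.5
mean((mod_xgb[keep] > 0) == (pos_neg[keep] == 1))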
## End(Not run)