validate_sentiment {sentimentr} | R Documentation |
Validate Sentiment Score Sign Against Known Results
Description
Provides multiclass macroaveraged and microaveraged precision, recall, accuracy, and F-score for the sign of the predicted sentiment against known sentiment scores. Sentiment analysis generally predicts three classes: positive (> 0), negative (< 0), and neutral (= 0). In assessing model performance one can use macro- or microaveraging across classes. Macroaveraging gives every class an equal say. Microaveraging gives a larger say to larger classes.
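As a rough illustration of the difference between the two averaging schemes, the base R sketch below computes precision both ways for the sign classes of the vectors used in the Examples section. It does not call validate_sentiment, and the handling of classes that are never predicted (treated as NA here) is an assumption, so the numbers may not match the package output exactly.

```r
actual    <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1, -1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1, -1)

# Reduce the scores to their sign: -1 (negative), 0 (neutral), 1 (positive)
act     <- sign(actual)
pred    <- sign(predicted)
classes <- c(-1, 0, 1)

# Per-class true positives and false positives (one class vs. all others)
tp <- sapply(classes, function(k) sum(pred == k & act == k))
fp <- sapply(classes, function(k) sum(pred == k & act != k))

# Macroaverage: precision per class, then an unweighted mean over classes
# (assumption: a class that is never predicted is dropped as NA)
class_precision <- ifelse(tp + fp > 0, tp / (tp + fp), NA)
macro_precision <- mean(class_precision, na.rm = TRUE)

# Microaverage: pool the counts over classes, then compute precision once
micro_precision <- sum(tp) / sum(tp + fp)
```

The neutral class is predicted three times and is never correct, so it pulls the macroaverage down by a full third, while in the microaverage it only contributes its three pooled false positives.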
Usage
validate_sentiment(predicted, actual, ...)
Arguments
predicted |
A numeric vector of predicted sentiment scores or a sentimentr object that returns sentiment scores. |
actual |
A numeric vector of known sentiment ratings. |
... |
Ignored. |
Value
Returns a data.frame with macroaveraged and microaveraged model validation scores. Additionally, the data.frame has the following attributes:
confusion_matrix |
A confusion matrix of all classes |
class_confusion_matrices |
A list of class-level (class vs. all) confusion matrices |
macro_stats |
A data.frame of the class-level stats before macroaveraging |
mda |
Mean Directional Accuracy |
mare |
Mean Absolute Rescaled Error |
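For intuition about the confusion_matrix attribute, a confusion matrix over the three sign classes can be sketched in base R as follows. This is only an approximation of the idea: the orientation (predicted classes in rows, actual classes in columns) is an assumption and may differ from what the package stores.

```r
actual    <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1, -1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1, -1)

classes <- c(-1, 0, 1)

# Fix the factor levels so that classes with no observations still appear
cm <- table(
    predicted = factor(sign(predicted), levels = classes),
    actual    = factor(sign(actual), levels = classes)
)

# Correct predictions sit on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
```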
Note
Mean Absolute Rescaled Error (MARE) is defined as:
\frac{\sum{|actual - predicted|}}{2n}
and gives a sense of how far off, on average, the rescaled predicted values (-1 to 1) are from the rescaled actual values (-1 to 1). A value of 0 means perfect accuracy. A value of 1 means perfectly wrong every time. A value of .5 represents the expected value for random guessing. This measure is related to Mean Absolute Error.
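The formula above can be sketched directly in base R. The helper mare_sketch is hypothetical and not part of sentimentr; it assumes both vectors have already been rescaled to the interval -1 to 1 and does not reproduce whatever rescaling the package applies internally.

```r
# Hypothetical helper, not part of sentimentr: assumes both vectors are
# already rescaled to the interval [-1, 1]
mare_sketch <- function(predicted, actual) {
    stopifnot(length(predicted) == length(actual))
    sum(abs(actual - predicted)) / (2 * length(actual))
}

mare_sketch(c(1, -1, 1), c(1, -1, 1))    # identical vectors give 0
mare_sketch(c(-1, 1, -1), c(1, -1, 1))   # maximally wrong predictions give 1
```

Dividing by 2n rather than n accounts for the largest possible gap between two values in the interval, which is 2, so the result always falls between 0 and 1.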
References
https://www.youtube.com/watch?v=OwwdYHWRB5E&index=31&list=PL6397E4B26D00A269
https://en.wikipedia.org/wiki/Mean_Directional_Accuracy_(MDA)
Examples
actual <- c(1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 1,-1)
predicted <- c(1, 0, 1, -1, 1, 0, -1, -1, -1, -1, 0, 1,-1)
validate_sentiment(predicted, actual)
scores <- hu_liu_cannon_reviews$sentiment
mod <- sentiment_by(get_sentences(hu_liu_cannon_reviews$text))
validate_sentiment(mod$ave_sentiment, scores)
validate_sentiment(mod, scores)
x <- validate_sentiment(mod, scores)
attributes(x)$confusion_matrix
attributes(x)$class_confusion_matrices
attributes(x)$macro_stats
## Annie Swafford Example
swafford <- data.frame(
    text = c(
        "I haven't been sad in a long time.",
        "I am extremely happy today.",
        "It's a good day.",
        "But suddenly I'm only a little bit happy.",
        "Then I'm not happy at all.",
        "In fact, I am now the least happy person on the planet.",
        "There is no happiness left in me.",
        "Wait, it's returned!",
        "I don't feel so bad after all!"
    ),
    actual = c(.8, 1, .8, -.1, -.5, -1, -1, .5, .6),
    stringsAsFactors = FALSE
)
pred <- sentiment_by(swafford$text)
validate_sentiment(
    pred,
    actual = swafford$actual
)