opi_score {opitools}    R Documentation

Opinion score of a digital text document (DTD)


Given a DTD, this function computes the overall opinion score based on the proportion of text records classified as expressing a positive, negative or neutral sentiment. The function first transforms the text document into a tidy-format dataframe, described as the observed sentiment document (OSD) (Adepeju and Jimoh, 2021), in which each text record is assigned a sentiment class based on the sum of the sentiment scores of the words in that record.


opi_score(textdoc, metric = 1, fun = NULL)



textdoc: An n x 1 list (dataframe) of individual text records, where n is the total number of individual records.


metric: (an integer) Specifies the metric used to calculate the opinion score. Valid values are 1, 2, ..., 5. Assuming P, N and O represent positive, negative and neutral record sentiments, respectively, the numerical values select the following opinion score functions:
1: Polarity (percentage difference), ((P - N) / (P + N)) * 100 (bounds: -100%, +100%);
2: Polarity (proportional difference), (abs(P - N) / (P + N + O)) * 100 (bounds: 0, +100%);
3: Positivity, (P / (P + N + O)) * 100 (bounds: 0, +100%);
4: Negativity, (N / (P + N + O)) * 100 (bounds: 0, +100%) (Malshe, 2019; Lowe et al., 2011);
5: A user-defined opinion score function (see also the fun parameter below).
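As a worked illustration of the four built-in formulas, the sketch below evaluates them in base R. It does not call opi_score; the counts P = 60, N = 30 and O = 10 are hypothetical values chosen only for illustration, and treating P, N and O as record counts is an assumption.

```r
# Hypothetical numbers of positive, negative and neutral records
# (illustrative values only, not taken from any dataset)
P <- 60
N <- 30
O <- 10

# Metric 1: polarity as a percentage difference (neutral records ignored)
polarity_pct <- ((P - N) / (P + N)) * 100

# Metric 2: polarity as a proportional difference over all records
polarity_prop <- (abs(P - N) / (P + N + O)) * 100

# Metric 3: positivity
positivity <- (P / (P + N + O)) * 100

# Metric 4: negativity
negativity <- (N / (P + N + O)) * 100

# Collect the four scores in one named vector
scores <- c(metric1 = polarity_pct, metric2 = polarity_prop,
            metric3 = positivity, metric4 = negativity)
print(round(scores, 2))
```

Note that metric 1 can be negative when negative records outnumber positive ones, whereas metrics 2 to 4 stay between 0 and 100.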


fun: A user-defined function, used when the metric parameter (above) is set to 5. For example, given an opinion score function defined as myfun <- function(P, N, O){ ("some tasks to do"); return("a value") }, the input argument of the fun parameter becomes fun = myfun. Default: NULL.
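Before passing a function through fun, it may help to check it on its own. The sketch below is a minimal illustration that assumes the function receives P, N and O as numeric record counts and should return a single number; the function body and the counts are hypothetical.

```r
# Hypothetical user-defined score: share of non-negative records, in percent
myfun <- function(P, N, O) {
  ((P + O) / (P + N + O)) * 100
}

# Try the function on made-up counts before using it as fun = myfun
check <- myfun(P = 60, N = 30, O = 10)

# The result should be a single numeric value
stopifnot(is.numeric(check), length(check) == 1)
```

A function that passes this kind of check can then be supplied as fun = myfun together with metric = 5.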


An opinion score is derived from all the sentiments (i.e. positive, negative and neutral) expressed within a text document. The function uses a lexicon-based approach (Taboada et al. 2011) with the AFINN lexicon (Nielsen, 2011).
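The lexicon-based idea can be sketched in base R as follows. This is not the package's implementation: the toy lexicon stands in for AFINN, the records are invented, and both the tokenization and the rule that a zero sum means "neutral" are assumptions made for illustration.

```r
# Toy lexicon standing in for AFINN (hypothetical words and scores)
toy_lexicon <- c(good = 3, helpful = 2, bad = -3, slow = -2)

# Hypothetical text records
records <- c("the police were good and helpful",
             "bad response and slow service",
             "officers arrived at noon")

# Sum the lexicon scores of the words in one record,
# then map the sign of the sum to a sentiment class
classify_record <- function(text, lexicon) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  total <- sum(lexicon[words[words %in% names(lexicon)]])
  if (total > 0) "positive" else if (total < 0) "negative" else "neutral"
}

classes <- vapply(records, classify_record, character(1),
                  lexicon = toy_lexicon, USE.NAMES = FALSE)

# Numbers of records per class, i.e. the P, N and O used by the score formulas
P <- sum(classes == "positive")
N <- sum(classes == "negative")
O <- sum(classes == "neutral")
```

Words absent from the lexicon contribute nothing to the sum, so a record with no scored words falls into the neutral class under this sketch.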


Returns an opi_object containing details of the opinion measures from the text document.


(1) Adepeju, M. and Jimoh, F. (2021). An Analytical Framework for Measuring Inequality in the Public Opinions on Policing – Assessing the impacts of COVID-19 Pandemic using Twitter Data. https://doi.org/10.31235/osf.io/c32qh
(2) Malshe, A. (2019). Data Analytics Applications. Online book available at: https://ashgreat.github.io/analyticsAppBook/index.html. Date accessed: 15th December 2020.
(3) Taboada, M. et al. (2011). Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2), pp. 267-307.
(4) Lowe, W. et al. (2011). Scaling policy preferences from coded political texts. Legislative Studies Quarterly, 36(1), pp. 123-155.
(5) Razorfish (2009). Fluent: The Razorfish Social Influence Marketing Report. Accessed: 24th February, 2021.
(6) Nielsen, F. A. (2011). "A new ANEW: Evaluation of a word list for sentiment analysis in microblogs". Proceedings of the ESWC2011 Workshop on 'Making Sense of Microposts': Big things come in small packages, pp. 93-98.


# Use police/pandemic posts on Twitter
# Experiment with a standard metric (e.g. metric 1)
score <- opi_score(textdoc = policing_dtd, metric = 1, fun = NULL)
# Print the result
print(score)

# Example using a user-defined opinion score function:
# a demonstration with a component of the SIM opinion
# score function (Razorfish, 2009). The opinion
# function can be expressed as:

myfun <- function(P, N, O){
  score <- (P + O - N)/(P + O + N)
  return(score)
}

# Run the analysis with the user-defined function
score <- opi_score(textdoc = policing_dtd, metric = 5, fun = myfun)
# Print the results
print(score)

[Package opitools version 1.8.0 Index]