| opi_sim {opitools} | R Documentation |
Simulates the opinion expectation distribution of a digital text document.
Description
This function simulates the expectation distribution of the
observed opinion score (computed using the opi_score function).
The resulting tidy-format dataframe can be described as the
expected sentiment document (ESD) (Adepeju and Jimoh, 2021).
Usage
opi_sim(osd_data, nsim=99, metric = 1, fun = NULL, quiet=TRUE)
Arguments
osd_data |
A list (dataframe). An |
nsim |
(an integer) Number of replicas (ESD) to simulate.
Recommended values are: 99, 999, 9999, and so on. Since the run time
is proportional to the number of replicas, a moderate number of
simulations, such as 999, is recommended. Default: 99.
metric |
(an integer) Specifies the metric to use for the
calculation of the opinion score. Default: 1.
fun |
A user-defined function given that parameter
|
quiet |
(TRUE or FALSE) Whether to suppress processing
messages. Default: TRUE.
Details
Employs a non-parametric randomization testing approach to generate the expectation distribution of the observed opinion scores (see details in Adepeju and Jimoh, 2021).
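The general idea of a randomization test can be sketched in base R as follows. This is only a minimal illustration under stated assumptions, not necessarily what opi_sim does internally: the toy data, the column names `sentiment` and `keywords`, the helper `polarity()`, and the choice to shuffle the indicator column are all hypothetical.

```r
set.seed(42)  # for reproducibility of this illustration only

# Hypothetical stand-in for an OSD: one sentiment class per record plus
# a present/absent indicator (column names are assumptions).
n <- 200
osd <- data.frame(
  sentiment = sample(c("positive", "negative", "neutral"), n,
                     replace = TRUE, prob = c(0.3, 0.5, 0.2)),
  keywords  = sample(c("present", "absent"), n,
                     replace = TRUE, prob = c(0.35, 0.65)),
  stringsAsFactors = FALSE
)

# Hypothetical score: percentage difference between positive and
# negative records (an assumption, not necessarily a metric of opi_score).
polarity <- function(sentiment) {
  p <- sum(sentiment == "positive")
  q <- sum(sentiment == "negative")
  if (p + q == 0) return(NA_real_)
  100 * (p - q) / (p + q)
}

# Observed score for the records that carry the indicator.
observed <- polarity(osd$sentiment[osd$keywords == "present"])

# Expectation distribution: shuffle the indicator across records and
# recompute the score, once per replica.
nsim <- 99
expected <- replicate(nsim, {
  shuffled <- sample(osd$keywords)
  polarity(osd$sentiment[shuffled == "present"])
})

# One-sided pseudo p-value: the observed score counts as one more replica.
p_value <- (sum(expected >= observed, na.rm = TRUE) + 1) / (nsim + 1)
```

In a sketch like this, values of nsim such as 99 or 999 are convenient because the denominator nsim + 1 becomes a round number (100 or 1000).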
Value
Returns a list of expected opinion scores with length equal
to the number of simulations (nsim) specified.
References
(1) Adepeju, M. and Jimoh, F. (2021). An Analytical Framework for Measuring Inequality in the Public Opinions on Policing – Assessing the impacts of COVID-19 Pandemic using Twitter Data. https://doi.org/10.31235/osf.io/c32qh
Examples
#Prepare OSD data from the output
#of the `opi_score` function.
score <- opi_score(textdoc = policing_dtd,
                   metric = 1, fun = NULL)
#extract OSD
OSD <- score$OSD
#note that `OSD` is shorter in length
#than `policing_dtd`, meaning that some
#text records were not classified
#Bind a fictitious indicator column
osd_data2 <- data.frame(cbind(OSD,
  keywords = sample(c("present", "absent"), nrow(OSD),
                    replace = TRUE, prob = c(0.35, 0.65))))
#generate expected distribution
exp_score <- opi_sim(osd_data2, nsim = 99, metric = 1,
                     fun = NULL, quiet = TRUE)
#preview the distribution
hist(exp_score)