ctr_agg {sentometrics}    R Documentation

Set up control for aggregation into sentiment measures

Description

Sets up control object for (computation of textual sentiment and) aggregation into textual sentiment measures.

Usage

ctr_agg(
  howWithin = "proportional",
  howDocs = "equal_weight",
  howTime = "equal_weight",
  do.sentence = FALSE,
  do.ignoreZeros = TRUE,
  by = "day",
  lag = 1,
  fill = "zero",
  alphaExpDocs = 0.1,
  alphasExp = seq(0.1, 0.5, by = 0.1),
  do.inverseExp = FALSE,
  ordersAlm = 1:3,
  do.inverseAlm = TRUE,
  aBeta = 1:4,
  bBeta = 1:4,
  weights = NULL,
  tokens = NULL,
  nCore = 1
)

Arguments

howWithin

a single character vector defining how to perform aggregation within documents or sentences. Coincides with the how argument in the compute_sentiment function. If length(howWithin) > 1, only the first element is used. For available options see get_hows()$words.

howDocs

a single character vector defining how aggregation across documents (and/or sentences) per date will be performed. If length(howDocs) > 1, only the first element is used. For available options see get_hows()$docs.

howTime

a character vector defining how aggregation across dates will be performed. More than one choice is possible. For available options see get_hows()$time.

do.sentence

see compute_sentiment.

do.ignoreZeros

a logical indicating whether zero sentiment values have to be ignored in the determination of the document (and/or sentence) weights while aggregating across documents (and/or sentences). By default do.ignoreZeros = TRUE, such that documents (and/or sentences) with a raw sentiment score of zero or for which a given feature indicator is equal to zero are considered irrelevant.

by

a single character vector, either "day", "week", "month" or "year", to indicate at what level the dates should be aggregated. Dates are displayed as the first day of the period, if applicable (e.g., "2017-03-01" for March 2017).

lag

a single integer vector, being the time lag to be specified for aggregation across time. By default equal to 1, meaning no aggregation across time; a time weighting scheme named "dummyTime" is used in this case.

fill

a single character vector, one of c("zero", "latest", "none"), to control how sentiment values are filled in for dates missing from the continuum of dates considered. This impacts the aggregation across time, as the measures_fill function is applied before aggregating, except if fill = "none". By default equal to "zero", which sets the scores (and thus also the weights) of the added dates to zero in the time aggregation.

alphaExpDocs

a single numeric vector. A weighting smoothing factor, used if "exponential" %in% howDocs or "inverseExponential" %in% howDocs. Value should be between 0 and 1 (both excluded); see weights_exponential.

alphasExp

a numeric vector of all exponential weighting smoothing factors, used if "exponential" %in% howTime. Values should be between 0 and 1 (both excluded); see weights_exponential.

do.inverseExp

a logical indicating if for every exponential curve its inverse has to be added, used if "exponential" %in% howTime; see weights_exponential.

ordersAlm

a numeric vector of all Almon polynomial orders (positive) to calculate weights for, used if "almon" %in% howTime; see weights_almon.

do.inverseAlm

a logical indicating if for every Almon polynomial its inverse has to be added, used if "almon" %in% howTime; see weights_almon.

aBeta

a numeric vector of positive values as first Beta weighting decay parameter, used if "beta" %in% howTime; see weights_beta.

bBeta

a numeric vector of positive values as second Beta weighting decay parameter, used if "beta" %in% howTime; see weights_beta.

weights

optional own weighting scheme(s), supplied as a data.frame with the number of rows equal to the desired lag; used together with "own" in howTime, as in the Examples.

tokens

see compute_sentiment.

nCore

see compute_sentiment.
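The effect of do.ignoreZeros on document weights can be illustrated with a small base R sketch. This is only an illustration of the idea under equal weighting, not the package's internal code; the scores vector is hypothetical.

```r
# Illustration only: base R, not the package's internal implementation.
scores <- c(0.4, 0, -0.2, 0)   # hypothetical document scores on one date

# Equal weights over all documents (zeros kept, as with do.ignoreZeros = FALSE)
w_all <- rep(1 / length(scores), length(scores))

# Equal weights over nonzero documents only (as with do.ignoreZeros = TRUE)
nonzero <- scores != 0
w_nonzero <- nonzero / sum(nonzero)

# Weighted aggregates under both conventions
agg_all <- sum(w_all * scores)
agg_nonzero <- sum(w_nonzero * scores)
print(c(all = agg_all, nonzero = agg_nonzero))
```

With zeros kept, the two zero-score documents dilute the aggregate (about 0.05); with zeros ignored, only the two nonzero documents share the weight (about 0.1).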

Details

For available options on how aggregation can occur (via the howWithin, howDocs and howTime arguments), inspect get_hows. The control parameters associated to howDocs are used both for aggregation across documents and across sentences.
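A minimal sketch of how the available options might be inspected before setting up a control object. It assumes only what the argument descriptions state, namely that get_hows() returns a list with elements words, docs and time.

```r
library(sentometrics)

# List the aggregation options at each of the three levels
hows <- get_hows()
hows$words  # options for howWithin
hows$docs   # options for howDocs
hows$time   # options for howTime
```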

Value

A list encapsulating the control parameters.
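As a brief sketch, the returned object can be examined like any other list. The names of its elements are not documented here, so the code below makes no assumption about them and only prints the top-level structure.

```r
library(sentometrics)

# Set up a control object and look at its top-level structure
ctr <- ctr_agg(howTime = "linear", by = "month", lag = 6)
is.list(ctr)
str(ctr, max.level = 1)
```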

Author(s)

Samuel Borms, Keven Bluteau

See Also

measures_fill, weights_almon, compute_sentiment

Examples

set.seed(505)

# simple control object
ctr1 <- ctr_agg(howTime = "linear", by = "year", lag = 3)

# more elaborate control object (particular attention to time weighting schemes)
ctr2 <- ctr_agg(howWithin = "proportionalPol",
                howDocs = "exponential",
                howTime = c("equal_weight", "linear", "almon", "beta", "exponential", "own"),
                do.ignoreZeros = TRUE,
                by = "day",
                lag = 20,
                ordersAlm = 1:3,
                do.inverseAlm = TRUE,
                alphasExp = c(0.20, 0.50, 0.70, 0.95),
                aBeta = c(1, 3),
                bBeta = c(1, 3, 4, 7),
                weights = data.frame(myWeights = runif(20)),
                alphaExpDocs = 0.3)

# set up control object with one linear and two chosen Almon weighting schemes
a <- weights_almon(n = 70, orders = 1:3, do.inverse = TRUE, do.normalize = TRUE)
ctr3 <- ctr_agg(howTime = c("linear", "own"), by = "year", lag = 70,
                weights = data.frame(a1 = a[, 1], a2 = a[, 3]),
                do.sentence = TRUE)


[Package sentometrics version 1.0.0 Index]