subset.sento_measures {sentometrics}

R Documentation

Subset sentiment measures

Description

Subsets rows of the sentiment measures based on their columns, and optionally selects or deletes individual sentiment measures.

Usage

## S3 method for class 'sento_measures'
subset(x, subset = NULL, select = NULL, delete = NULL, ...)

Arguments

`x`	a `sento_measures` object created using `sento_measures`.
`subset`	a logical (non-`character`) expression indicating the rows to keep. If a `numeric` input is given, it is used for row index subsetting.
`select`	a `character` vector of lexicon, feature and time weighting scheme names indicating which measures to select, or a `list` of `character` vectors, each specifying a separate combination (consisting of at most one lexicon, one feature, and one time weighting scheme).
`delete`	same format as the `select` argument, but indicating which measures to delete.
`...`	not used.
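
As a minimal sketch of the `list` format accepted by `select` and `delete`, full measure names can be split into their components with base R. This assumes the `lexicon--feature--time` naming pattern seen in the Examples; the object names `measure_names` and `to_select` and the specific measure names are illustrative only and are not taken from a real `sento_measures` object.

```r
# Hypothetical measure names following the "lexicon--feature--time" pattern
# used in the Examples; they are illustrative, not read from a real object.
measure_names <- c("LM_en--economy--linear",
                   "HENRY_en--wsj--equal_weight")

# Split each name on the literal "--" separator; strsplit() returns a list
# with one character vector per measure.
to_select <- strsplit(measure_names, split = "--", fixed = TRUE)

# Each element should hold exactly three components:
# one lexicon, one feature and one time weighting scheme.
stopifnot(all(lengths(to_select) == 3))

str(to_select)
```

Such a list could then be passed as `subset(x, select = to_select)`, provided the `sento_measures` object `x` actually contains those measures.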

Value

A modified `sento_measures` object containing only the remaining rows and sentiment measures, with updated information and statistics; the original sentiment scores `data.table` is left untouched.

Author(s)

Samuel Borms

Examples

data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
data("list_valence_shifters", package = "sentometrics")

# construct a sento_measures object to start with
corpus <- sento_corpus(corpusdf = usnews)
corpusSample <- quanteda::corpus_sample(corpus, size = 500)
l <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")])
ctr <- ctr_agg(howTime = c("equal_weight", "linear"), by = "year", lag = 3)
sm <- sento_measures(corpusSample, l, ctr)

# three fully specified measures (lexicon, feature, time weighting scheme)
# in the required list format; stri_split() already returns a list
three <- stringi::stri_split(c("LM_en--economy--linear",
                               "HENRY_en--wsj--equal_weight",
                               "HENRY_en--wapo--equal_weight"),
                             regex = "--")

# different subsets
sub1 <- subset(sm, HENRY_en--economy--equal_weight >= 0.01)
sub2 <- subset(sm, date %in% get_dates(sm)[3:12])
sub3 <- subset(sm, 3:12)
sub4 <- subset(sm, 1:100) # warning

# different selections
sel1 <- subset(sm, select = "equal_weight")
sel2 <- subset(sm, select = c("equal_weight", "linear"))
sel3 <- subset(sm, select = c("linear", "LM_en"))
sel4 <- subset(sm, select = list(c("linear", "wsj"), c("linear", "economy")))
sel5 <- subset(sm, select = three)

# different deletions
del1 <- subset(sm, delete = "equal_weight")
del2 <- subset(sm, delete = c("linear", "LM_en"))
del3 <- subset(sm, delete = list(c("linear", "wsj"), c("linear", "economy")))
del4 <- subset(sm, delete = c("equal_weight", "linear")) # warning
del5 <- subset(sm, delete = three)

[Package sentometrics version 1.0.0 Index]