s_attributes {polmineR}    R Documentation

Get s-attributes.

Description

Structural annotations (s-attributes) of a corpus capture metainformation for regions of tokens. The s_attributes()-method offers high-level access to the s-attributes present in a corpus or subcorpus, or the values of s-attributes in a corpus/partition.

Usage

s_attributes(.Object, ...)

## S4 method for signature 'character'
s_attributes(.Object, s_attribute = NULL, unique = TRUE, regex = NULL, ...)

## S4 method for signature 'corpus'
s_attributes(.Object, s_attribute = NULL, unique = TRUE, regex = NULL, ...)

## S4 method for signature 'slice'
s_attributes(.Object, s_attribute = NULL, unique = TRUE, ...)

## S4 method for signature 'partition'
s_attributes(.Object, s_attribute = NULL, unique = TRUE, ...)

## S4 method for signature 'subcorpus'
s_attributes(.Object, s_attribute = NULL, unique = TRUE, ...)

## S4 method for signature 'context'
s_attributes(.Object, s_attribute = NULL)

## S4 method for signature 'partition_bundle'
s_attributes(.Object, s_attribute, unique = TRUE, ...)

## S4 method for signature 'call'
s_attributes(.Object, corpus)

## S4 method for signature 'quosure'
s_attributes(.Object, corpus)

## S4 method for signature 'name'
s_attributes(.Object, corpus)

## S4 method for signature 'remote_corpus'
s_attributes(.Object, ...)

## S4 method for signature 'remote_partition'
s_attributes(.Object, ...)

## S4 method for signature 'data.table'
s_attributes(.Object, corpus, s_attribute, registry)

Arguments

.Object

A corpus, subcorpus, partition object, or a call. A corpus can also be specified by a length-one character vector.

...

Used to maintain backward compatibility if the deprecated argument sAttribute is used. If .Object is a remote_corpus or remote_subcorpus object, the three dots (...) are used to pass arguments. Hence, it is necessary to state the names of all arguments to be passed explicitly.

s_attribute

The name of a specific s-attribute.

unique

A logical value, whether to return unique values.

regex

A regular expression passed into grep to filter the return value.

corpus

A corpus object or a length-one character vector denoting a corpus.

registry

The registry directory with the registry file defining corpus. If missing, the registry directory that can be derived using RcppCWB::corpus_registry_dir() is used.

Details

Importing XML into the Corpus Workbench (CWB) turns elements and element attributes into so-called "s-attributes". There are two basic uses of the s_attributes()-method: If the argument s_attribute is NULL (default), the return value is a character vector with all s-attributes present in a corpus.

If s_attribute denotes a specific s-attribute (a length-one character vector), the values of that s-attribute available in the corpus/partition are returned. If the s-attribute does not have values, NA is returned and a warning message is issued.

If argument unique is FALSE, the full sequence of the values of the s-attribute is returned, which is a useful building block for decoding a corpus.

If argument s_attribute is a character vector providing several s-attributes, the method will return a data.table. If unique is TRUE, all unique combinations of the s-attributes will be reported by the data.table.

If the corpus is based on a nested XML structure, the order of items in the s_attribute vector matters. The method for corpus objects will take the first s-attribute as the benchmark and assume that further s-attributes are XML ancestors of the node.

If .Object is a context object, the s-attribute value for the first corpus position of every match is returned in a character vector. If the match is outside a region of the s-attribute, NA is returned.

If .Object is a call or a quosure (defined in the rlang package), the s_attributes()-method will return a character vector with the s-attributes occurring in the call. This usage is relevant internally to implement the subset method, which generates a subcorpus using non-standard evaluation. Usually it will not be relevant in an interactive session.

Value

A character vector (s-attributes, or values of s-attributes). If several s-attributes are provided, a data.table.

Examples

use("polmineR")

s_attributes("GERMAPARLMINI")
s_attributes("GERMAPARLMINI", "date") # dates of plenary meetings
s_attributes("GERMAPARLMINI", s_attribute = c("date", "party"))  
s_attributes(corpus("GERMAPARLMINI"))
p <- partition("GERMAPARLMINI", date = "2009-11-10")
s_attributes(p)
s_attributes(p, "speaker") # get names of speakers
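
# Sketch of the regex and unique arguments (an added illustration, not
# part of the original examples; assumes the GERMAPARLMINI sample corpus
# loaded above, output not shown)
merkel <- s_attributes("GERMAPARLMINI", "speaker", regex = "Merkel")
dates_all <- s_attributes("GERMAPARLMINI", "date", unique = FALSE)
length(dates_all) # full sequence of values, duplicates included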

# Get s-attributes occurring in a call
s_attributes(quote(grep("Merkel", speaker)), corpus = "GERMAPARLMINI")
s_attributes(quote(speaker == "Angela Merkel"), corpus = "GERMAPARLMINI")
s_attributes(quote(speaker != "Angela Merkel"), corpus = "GERMAPARLMINI")
s_attributes(
  quote(speaker == "Angela Merkel" & date == "2009-10-28"),
  corpus = "GERMAPARLMINI"
)

# Get s-attributes from quosure
s_attributes(
  rlang::new_quosure(quote(grep("Merkel", speaker))),
  corpus = "GERMAPARLMINI"
)
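
# Sketch of the method for context objects (an added illustration; the
# query "Integration" is an assumption chosen for this example, output
# not shown). One value per match is returned, NA if the match is
# outside a region of the s-attribute.
ctx <- context("GERMAPARLMINI", query = "Integration")
speakers_at_match <- s_attributes(ctx, "speaker")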

[Package polmineR version 0.8.9 Index]