cl_attribute_size {RcppCWB}        R Documentation

Get Attribute Size (of Positional/Structural Attribute).

Description

Use cl_attribute_size() to get the total number of values of a positional attribute (attribute_type = "p") or of a structural attribute (attribute_type = "s"). Indices are zero-based, so the highest valid index (the corpus position of a positional attribute, or the struc of a structural attribute) is the attribute size minus 1 (see examples).

Usage

cl_attribute_size(
  corpus,
  attribute,
  attribute_type,
  registry = Sys.getenv("CORPUS_REGISTRY")
)

Arguments

corpus

name of a CWB corpus (upper case)

attribute

name of a p- or s-attribute

attribute_type

either "p" or "s", for positional/structural attribute

registry

path to the registry directory, defaults to the value of the environment variable CORPUS_REGISTRY

Examples

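library(RcppCWB)

# Size of the positional attribute "word"
token_no <- cl_attribute_size(
  "REUTERS",
  attribute = "word",
  attribute_type = "p",
  registry = get_tmp_registry()
)
# Corpus positions are zero-based: 0 to token_no - 1
corpus_positions <- seq.int(from = 0, to = token_no - 1)
# Look up the token id at every corpus position
cl_cpos2id(
  "REUTERS",
  "word",
  cpos = corpus_positions,
  registry = get_tmp_registry()
)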

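# Size of the structural attribute "places"
places_no <- cl_attribute_size(
  "REUTERS",
  attribute = "places",
  attribute_type = "s",
  registry = get_tmp_registry()
)
# Strucs are zero-based as well: 0 to places_no - 1
strucs <- seq.int(from = 0, to = places_no - 1)
# Get the string value for every struc
cl_struc2str(
  "REUTERS",
  "places",
  struc = strucs,
  registry = get_tmp_registry()
)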
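
The zero-based convention noted in the Description can also be sketched without a corpus at hand. The snippet below is a minimal illustration in base R only: zero_based_index() is a hypothetical helper, not part of RcppCWB, and the size value is a stand-in for what cl_attribute_size() would return.

```r
# Hypothetical helper (not part of RcppCWB): turn an attribute size into the
# vector of valid zero-based indices 0, 1, ..., size - 1.
zero_based_index <- function(size) {
  stopifnot(length(size) == 1L, !is.na(size), size >= 0)
  # seq_len(0) is empty, so a size of 0 gives integer(0) rather than the
  # descending sequence c(0, -1) that seq.int(from = 0, to = size - 1) yields.
  seq_len(size) - 1L
}

# Stand-in value; in practice this would come from cl_attribute_size().
size <- 5L
idx <- zero_based_index(size)

print(idx)                           # all valid indices
print(max(idx) == size - 1L)         # highest index is size minus 1
print(length(zero_based_index(0L)))  # no valid indices for size 0
```

Building the index with seq_len() instead of seq.int(from = 0, to = size - 1) is a design choice that only matters in the edge case of a size of 0; for any positive size both give the same indices.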

[Package RcppCWB version 0.6.4 Index]