cl_rework {RcppCWB} | R Documentation |
Low-level CL access.
Description
Wrappers for CWB Corpus Library functions suited for writing performance-critical code.
Usage
s_attr(corpus, s_attribute, registry)
p_attr(corpus, p_attribute, registry)
p_attr_size(p_attr)
s_attr_size(s_attr)
p_attr_lexicon_size(p_attr)
cpos_to_struc(s_attr, cpos)
cpos_to_str(p_attr, cpos)
cpos_to_id(p_attr, cpos)
struc_to_cpos(s_attr, struc)
struc_to_str(s_attr, struc)
regex_to_id(p_attr, regex)
str_to_id(p_attr, str)
id_to_freq(p_attr, id)
id_to_cpos(p_attr, id)
cpos_to_lbound(s_attr, cpos)
cpos_to_rbound(s_attr, cpos)
Arguments
corpus |
ID of a CWB corpus (length-one character vector). |
s_attribute |
A structural attribute (length-one character vector). |
registry |
Registry directory. |
p_attribute |
A positional attribute (length-one character vector). |
p_attr |
A |
s_attr |
A |
cpos |
An |
struc |
A length-one |
regex |
A regular expression. |
str |
A |
id |
An |
Details
The default cl_* R wrappers for the functions of the CWB Corpus Library
look up a corpus and its p- or s-attribute (using the corpus ID, registry
and attribute indicated by length-one character vectors) every time one of
these functions is called. It is more efficient to look up an attribute only
once. This set of functions passes "externalptr" objects that reference
attributes that have already been looked up. A relevant scenario is writing
functions with a C++ implementation that are compiled and linked using
Rcpp::cppFunction() or Rcpp::sourceCpp().
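The difference between the two calling styles can be sketched at the R level as follows. This is a minimal sketch, not a benchmark: it assumes the REUTERS sample corpus and the temporary registry used in the Examples section are available, and that the default wrapper for decoding corpus positions to token IDs is cl_cpos2id() with the arguments shown.

```r
library(RcppCWB)

registry <- get_tmp_registry()

# Default wrapper: corpus and attribute are looked up on every call.
# (Assumption: cl_cpos2id() is the cl_* counterpart of cpos_to_id().)
ids_default <- cl_cpos2id(
  corpus = "REUTERS", p_attribute = "word",
  registry = registry, cpos = 0:50
)

# Pointer-based variant: look up the p-attribute once ...
word <- p_attr("REUTERS", "word", registry)

# ... then reuse the external pointer for any number of calls.
n_tokens <- p_attr_size(word)
ids <- cpos_to_id(word, 0:50)
tokens <- cpos_to_str(word, 0:50)
```

Both variants are expected to return the same IDs; the second avoids repeating the lookup when many calls target the same attribute.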
Examples
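library(Rcpp)

# Compile a C++ function that looks up the p-attribute once and then decodes
# corpus positions to strings using the pointer-based RcppCWB functions.
cppFunction(
  'Rcpp::StringVector get_str(
    SEXP corpus,
    SEXP p_attribute,
    SEXP registry,
    Rcpp::IntegerVector cpos
  ){
    // Look up the p-attribute once; attr references it as an external pointer.
    SEXP attr = RcppCWB::p_attr(corpus, p_attribute, registry);
    // Decode the corpus positions using the attribute pointer.
    Rcpp::StringVector result = RcppCWB::cpos_to_str(attr, cpos);
    return result;
  }',
  depends = "RcppCWB"
)

result <- get_str("REUTERS", "word", RcppCWB::get_tmp_registry(), 0:50)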
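The lexicon-side functions listed under Usage can be chained in the same way, reusing a single attribute pointer. The following is a hedged sketch at the R level: it assumes the REUTERS sample corpus is available via the temporary registry and that the token "oil" occurs in it (the token is chosen for illustration only).

```r
library(RcppCWB)

# Look up the p-attribute once and keep the external pointer.
word <- p_attr("REUTERS", "word", get_tmp_registry())

# Token string -> lexicon ID ("oil" is an illustrative assumption).
id <- str_to_id(word, "oil")

# Lexicon ID -> corpus frequency and corpus positions of all occurrences.
freq <- id_to_freq(word, id)
cpos <- id_to_cpos(word, id)

# Decoding the positions again should yield the original token.
decoded <- cpos_to_str(word, cpos)
```

Because every call receives the same pointer, the corpus and attribute lookup happens only once for the whole chain.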