cast_text {rsyntax} | R Documentation
Cast annotations to text
Description
Cast labeled tokens to sentences.
Usage
cast_text(tokens, annotation, ..., text_col = "token", na.rm = T)
Arguments
tokens | A tokenIndex.
annotation | The name of the annotation column (the "column" argument in annotate_tqueries).
... | Optionally, group annotations together. Each named argument defines a group: the name is the new group label, and the value is a character vector of values in the annotation column. For example, text = c('verb','predicate') would group the 'verb' and 'predicate' nodes together under the name 'text'.
text_col | The name of the column in tokens that contains the text. Usually this is "token", but some parsers use alternatives such as 'word'.
na.rm | If TRUE (default), drop tokens where the annotation id is NA (i.e. tokens without labels).
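The grouping and na.rm behaviour described above can be illustrated with a rough base R sketch. This is not the package's implementation, and the layout of the real return value may differ. The toy data frame and its column names (clause_id, clause, token) are assumptions made for the illustration.

```r
# Toy stand-in for annotated tokens (hypothetical data and column names)
toy <- data.frame(
  clause_id = c(1, 1, 1, 1, NA),
  clause    = c("subject", "verb", "object", "object", NA),
  token     = c("Mary", "saw", "the", "dog", "."),
  stringsAsFactors = FALSE
)

# Analogue of na.rm = TRUE: drop tokens without an annotation id
toy <- toy[!is.na(toy$clause_id), ]

# Analogue of text = c('verb', 'object'): relabel both values as 'text'
groups <- list(text = c("verb", "object"))
for (g in names(groups)) {
  toy$clause[toy$clause %in% groups[[g]]] <- g
}

# Paste the tokens together per annotation id and (grouped) label
long <- aggregate(token ~ clause_id + clause, data = toy,
                  FUN = paste, collapse = " ")
print(long)
```

In this sketch the 'verb' and 'object' tokens end up in a single 'text' entry, while the ungrouped 'subject' label is kept as it is.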
Value
a data.table
Examples
tokens = tokens_spacy[tokens_spacy$doc_id == 'text3',]
## two simple example tqueries
passive = tquery(pos = "VERB*", label = "verb", fill = FALSE,
                 children(relation = "agent",
                          children(label = "subject")),
                 children(relation = "nsubjpass", label = "object"))
active = tquery(pos = "VERB*", label = "verb", fill = FALSE,
                children(relation = c("nsubj", "nsubjpass"), label = "subject"),
                children(relation = "dobj", label = "object"))
tokens = annotate_tqueries(tokens, "clause", pas = passive, act = active, overwrite = TRUE)
cast_text(tokens, 'clause')
## group annotations
cast_text(tokens, 'clause', text = c('verb','object'))
## use grouping to sort
cast_text(tokens, 'clause', subject = 'subject',
          verb = 'verb', object = 'object')
[Package rsyntax version 0.1.4 Index]