cast_text {rsyntax}    R Documentation

Cast annotations to text

Description

Cast labeled tokens to sentences.

Usage

cast_text(tokens, annotation, ..., text_col = "token", na.rm = T)

Arguments

tokens

A tokenIndex

annotation

The name of the annotation (the "column" argument in annotate_tqueries)

...

Optionally, group annotations together. Each named argument defines one group: the name is the new group label, and the value is a character vector of values from the annotation column. For example, text = c('verb','predicate') would group the 'verb' and 'predicate' nodes together under the name 'text'.
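
The grouping idea can be illustrated in base R without any parsed data. This is only a minimal sketch of the relabelling described above, not the package's implementation; the helper name group_labels and the toy label vector are hypothetical.

```r
# Hypothetical helper (not part of rsyntax): relabel annotation values
# according to named groups passed through '...'.
group_labels <- function(labels, ...) {
  groups <- list(...)
  out <- labels
  for (name in names(groups)) {
    # every label listed in this group gets the group's name
    out[labels %in% groups[[name]]] <- name
  }
  out
}

# Toy annotation values (assumed for illustration)
labels <- c("subject", "verb", "predicate", "object")
grouped <- group_labels(labels, text = c("verb", "predicate"))
grouped
```

In this sketch, labels that are not named in any group keep their original value.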

text_col

The name of the column in tokens with the text. Usually this is "token", but some parsers use alternatives such as 'word'.

na.rm

If TRUE (default), drop tokens whose annotation id is NA (i.e. tokens without labels).
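
The effect of dropping unlabeled tokens can be sketched on a toy table in base R. The column name clause_id and the token values are assumptions made for illustration; this shows only the filtering step, not what cast_text does internally.

```r
# Toy token table; 'clause_id' is an assumed name for the annotation id column
toy <- data.frame(
  token     = c("Mary", "was", "kissed", "by", "John", "."),
  clause_id = c("a", NA, "a", NA, "a", NA),
  stringsAsFactors = FALSE
)

# Keep only tokens that carry an annotation id
kept <- toy[!is.na(toy$clause_id), ]
kept$token
```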

Value

A data.table

Examples

tokens = tokens_spacy[tokens_spacy$doc_id == 'text3',]

## two simple example tqueries
passive = tquery(pos = "VERB*", label = "verb", fill=FALSE,
                 children(relation = "agent",
                          children(label="subject")),
                 children(relation = "nsubjpass", label="object"))
active =  tquery(pos = "VERB*", label = "verb", fill=FALSE,
                 children(relation = c("nsubj", "nsubjpass"), label = "subject"),
                 children(relation = "dobj", label="object"))

tokens = annotate_tqueries(tokens, "clause", pas=passive, act=active, overwrite=TRUE)

cast_text(tokens, 'clause')

## group annotations
cast_text(tokens, 'clause', text = c('verb','object'))

## use grouping to sort
cast_text(tokens, 'clause', subject = 'subject', 
                            verb = 'verb', object = 'object')

[Package rsyntax version 0.1.4 Index]