udpipe_spanquote_tqueries {corpustools}    R Documentation

Get a list of tqueries for finding candidates for span quotes.


Quote extraction with tqueries is limited to quotes within sentences. When (verbatim) quotes span multiple sentences (which we call span quotes here), they are often indicated with quotation marks. While it is relatively easy to identify these quotes, it is less straightforward to identify their sources. A good approach is to first apply tqueries for finding quotes within sentences, because a source mentioned shortly before a span quote (we look back 2 sentences) is often also the source of that span quote. For cases where there is no previous source, we can apply simple queries for finding source candidates. That is what the tqueries created by this function are for.


udpipe_spanquote_tqueries(say_verbs = verb_lemma("quote"))



say_verbs    A character vector of verb lemmas that indicate speech (e.g., say, state). A default list is provided by verb_lemma('quote'), but other lemmas may be more accurate or appropriate depending on the corpus.


This procedure is supported in rsyntax with the add_span_quotes function. In corpustools this function is implemented within the udpipe_quotes method. The current function provides the default tqueries for the span quotes.
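
As a minimal sketch of how the say_verbs argument might be customized, the snippet below extends the default lemma list before building the span quote tqueries. The extra lemmas ("tweet", "blog") are hypothetical and chosen only for illustration. The sketch assumes the return value is an ordinary R list of tqueries, as the title of this page suggests.

```r
library(corpustools)

# Default speech-verb lemmas shipped with corpustools.
default_verbs <- verb_lemma("quote")

# Hypothetical corpus-specific additions (illustration only, not part of
# the package defaults).
custom_verbs <- union(default_verbs, c("tweet", "blog"))

# Build the span quote tqueries with the extended lemma list.
span_queries <- udpipe_spanquote_tqueries(say_verbs = custom_verbs)

# Inspect how many candidate queries were created.
length(span_queries)
```

Whether extra lemmas improve results depends on the corpus, so it is worth comparing the sources found with the default and the extended list.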



[Package corpustools version 0.4.10 Index]