| index {quanteda} | R Documentation |
Locate a pattern in a tokens object
Description
Locates a pattern within a tokens object, returning the index positions of the beginning and ending tokens in the pattern.
Usage
index(
  x,
  pattern,
  valuetype = c("glob", "regex", "fixed"),
  case_insensitive = TRUE
)

is.index(x)
Arguments
x |
an input tokens object |
pattern |
a character vector, list of character vectors, dictionary, or collocations object. See pattern for details. |
valuetype |
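the type of pattern matching: "glob" for "glob"-style wildcard expressions; "regex" for regular expressions; or "fixed" for exact matching. See valuetype for details. |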
case_insensitive |
logical; if TRUE, ignore case when matching a pattern or dictionary values |
Value
a data.frame consisting of one row per pattern match, with columns
for the document name, index positions from and to, and the pattern
matched.
is.index returns TRUE if the object was created by
index(); FALSE otherwise.
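
As a minimal sketch of working with the return value, the snippet below builds a small tokens object (the two example texts are invented for illustration), locates a two-token phrase, and uses the from and to columns described above to compute the length of each match. It assumes the default case_insensitive = TRUE, so both capitalisations of the phrase should be found.

```r
library(quanteda)

# Two short, made-up documents used purely for illustration
toks <- tokens(c(d1 = "The United States is secure",
                 d2 = "Securing the united states"))

# Locate a multi-token pattern; phrase() marks it as a token sequence
idx <- index(toks, pattern = phrase("united states"))

# is.index() distinguishes the result of index() from other objects
is.index(idx)
is.index(toks)

# Length in tokens of each match, derived from the from and to positions
span_length <- idx$to - idx$from + 1
span_length
```

Because from and to are both inclusive positions, a single-token match has from equal to to, which is why the span length adds 1.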
Examples
toks <- tokens(data_corpus_inaugural[1:8])
index(toks, pattern = "secure*")
index(toks, pattern = c("secure*", phrase("united states"))) |> head()
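
The following sketch compares how the valuetype and case_insensitive arguments change what is matched. The input sentence is invented for illustration, and the patterns are only examples; the comparison relies on the row count of each result, since the Value section states that one row is returned per match.

```r
library(quanteda)

# A single made-up document containing several related word forms
toks_small <- tokens(c(d1 = "Secure the Security of secure and secured systems"))

# Default glob matching: the wildcard matches any token starting with "secur"
glob_hits <- index(toks_small, pattern = "secur*")

# Fixed matching: the whole token must equal the pattern (case ignored by default)
fixed_hits <- index(toks_small, pattern = "secure", valuetype = "fixed")

# Regular expression matching: anchored alternation over two word forms
regex_hits <- index(toks_small, pattern = "^secur(e|ed)$", valuetype = "regex")

# Case-sensitive fixed matching: only the lowercase token qualifies
cs_hits <- index(toks_small, pattern = "secure", valuetype = "fixed",
                 case_insensitive = FALSE)

# Compare the number of matches found under each setting
c(glob = nrow(glob_hits), fixed = nrow(fixed_hits),
  regex = nrow(regex_hits), case_sensitive = nrow(cs_hits))
```

Narrowing from glob to fixed matching, and then turning off case insensitivity, should progressively reduce the number of rows returned.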
[Package quanteda version 4.0.2 Index]