re {mclm}    R Documentation

Build a regular expression

Description

Create an object of class re or coerce a character vector to an object of class re.

Usage

re(x, perl = TRUE, ...)

as_re(x, perl = TRUE, ...)

as.re(x, perl = TRUE, ...)

Arguments

x

Character vector of length one. Its value is assumed to be a well-formed regular expression; the current implementation does not check this.

perl

Logical. If TRUE, x is assumed to use PCRE (Perl Compatible Regular Expressions) notation. If FALSE, x is assumed to use base R's default regular expression notation. Unlike base R's regular expression functions, where perl defaults to FALSE, re() assumes the PCRE flavor by default.

...

Additional arguments.

Details

This class exists because some functions in the mclm package require their arguments to be marked as regular expressions. For example, keep_re() accepts either a plain character string or a re object as its pattern argument, but subsetting with brackets by means of a regular expression requires a re object.

Value

An object of class re, which is a wrapper around a character vector that flags it as containing a regular expression. In essence it is a named list: the x item contains the x input and the perl item contains the value of the perl argument (TRUE by default).

It has basic methods such as print(), summary() and as.character().

See Also

perl_flavor(), scan_re(), cat_re()

Examples

toy_corpus <- "Once upon a time there was a tiny toy corpus.
  It consisted of three sentences. And it lived happily ever after."

(tks <- tokenize(toy_corpus))

# In `keep_re()`, the use of `re()` is optional
keep_re(tks, re("^.{3,}"))
keep_re(tks, "^.{3,}")

# When using bracket notation, `re()` is necessary to mark the string as a regular expression
tks[re("^.{3,}")]
tks["^.{3,}"]

# build and print a `re` object
re("^.{3,}")
as_re("^.{3,}")
as.re("^.{3,}")
print(re("^.{3,}"))
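
# A base R sketch of the structure described under Value: a named list
# with an `x` item and a `perl` item, plus a class attribute.
# This is an illustration under stated assumptions, not mclm's actual
# implementation; `toy_re()` and `toy_grepl()` are hypothetical names
# that do not exist in mclm.
toy_re <- function(x, perl = TRUE) {
  # like `re()`, expect a single string; well-formedness is not checked
  stopifnot(is.character(x), length(x) == 1, is.logical(perl))
  structure(list(x = x, perl = perl), class = "toy_re")
}

toy_grepl <- function(pattern, text) {
  # pass the stored pattern and flavor flag on to base R's matcher
  grepl(pattern$x, text, perl = pattern$perl)
}

p <- toy_re("^.{3,}")
p$x     # the pattern string
p$perl  # the flavor flag, TRUE by default
toy_grepl(p, c("a", "time", "toy"))

# Why the flavor flag matters: a lookahead is PCRE notation.
toy_grepl(toy_re("toy(?= corpus)"), "a tiny toy corpus")

# With `perl = FALSE` the same pattern is handed to base R's default
# engine, which is not expected to accept the lookahead; the call is
# wrapped so that an error or warning yields NA instead of stopping.
tryCatch(
  toy_grepl(toy_re("toy(?= corpus)", perl = FALSE), "a tiny toy corpus"),
  error = function(e) NA,
  warning = function(w) NA
)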

[Package mclm version 0.2.7 Index]