ore {ore} | R Documentation |
Oniguruma regular expressions
Description
Create, test for, and print objects of class "ore", which represent
Oniguruma regular expressions. These are unit-length character vectors with
additional attributes, including a pointer to the compiled version.
Usage
ore(..., options = "", encoding = getOption("ore.encoding"),
syntax = c("ruby", "fixed"))
is_ore(x)
## S3 method for class 'ore'
print(x, ...)
Arguments
... |
One or more strings or dictionary labels, constituting a valid
regular expression after being concatenated together. Elements drawn from
the dictionary will be surrounded by parentheses, turning them into
groups. Note that backslashes should be doubled, to avoid them being
interpreted as character escapes by R. The |
options |
A string composed of characters indicating variations on the
usual interpretation of the regex. These may currently include |
encoding |
A string specifying the encoding that matching will take
place in. The default is given by the "ore.encoding" option. |
syntax |
The regular expression syntax being used. The default is
"ruby"; the alternative listed in the usage above is "fixed". |
x |
An R object. |
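As a brief sketch of the arguments described above (assuming the ore package is installed and attached), the following shows why backslashes are doubled and how several strings are concatenated into one pattern. The variable names are illustrative only, and the literal-matching behaviour of syntax = "fixed" is an assumption based on its name.

```r
library(ore)

# In an R string literal, "\\d" is two characters: a backslash and "d"
nchar("\\d")  # 2

# Several strings are concatenated into a single pattern
re_parts <- ore("-?", "\\d+")
re_whole <- ore("-?\\d+")

# Both objects should carry the same pattern text
identical(as.character(re_parts), as.character(re_whole))

# Assumed: with syntax = "fixed", the "." is treated as a literal character
re_fixed <- ore("1.5", syntax = "fixed")
```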
Value
The ore function returns the final pattern, with class "ore" and the
following attributes:
- .compiled
A low-level pointer to the compiled version of the regular expression.
- options
Options, copied from the argument of the same name.
- encoding
The specified or detected encoding.
- syntax
The specified syntax type.
- nGroups
The number of groups in the pattern.
- groupNames
Group names, if applicable.
The is_ore function returns a logical vector indicating whether its
argument represents an "ore" object.
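The returned attributes can be inspected with base R's attr function. The sketch below assumes the ore package is attached and uses a hypothetical date-like pattern with two named groups, written in Oniguruma's (?<name>...) syntax.

```r
library(ore)

# A pattern with two named groups
re <- ore("(?<year>\\d{4})-(?<month>\\d{2})")

# Test whether an object is an "ore" object
is_ore(re)
is_ore("\\d{4}")   # a plain character string, not compiled

# Inspect the attributes listed above
attr(re, "nGroups")
attr(re, "groupNames")
attr(re, "syntax")
attr(re, "encoding")
```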
See Also
For full details of supported syntax, please see
https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE. The regex
page is also useful as a quick reference, since PCRE (used by base R) and
Oniguruma (used by ore) have similar features. See ore_dict for details of
the pattern dictionary.
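A minimal sketch of how a dictionary label might be combined with literal strings is given below. It assumes that ore_dict accepts named patterns and that a label is passed to ore as an unquoted name; the label "protocol" is purely illustrative, and the ore_dict page remains the authoritative description.

```r
library(ore)

# Assumed usage: register a named pattern in the dictionary
ore_dict(protocol = "\\w+://")

# Refer to the label by name; per the Arguments section, dictionary
# elements are wrapped in parentheses and so become groups
url_re <- ore(protocol, "?www\\.")

is_ore(url_re)
attr(url_re, "nGroups")
```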
Examples
# This matches a positive or negative integer
ore("-?\\d+")
# This matches words of exactly four characters
ore("\\b\\w{4}\\b")
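To see the two example patterns in use, they can be applied to some text. The functions ore_ismatch, ore_search and matches are not described on this page; they are assumed here from the rest of the ore package, and the input strings are made up for illustration.

```r
library(ore)

int_re  <- ore("-?\\d+")
word_re <- ore("\\b\\w{4}\\b")

# Test each element of a character vector for a match
ore_ismatch(int_re, c("42", "-7", "abc"))

# Find every four-character word in a sentence and extract the matched text
found <- ore_search(word_re, "This sentence has some long words", all = TRUE)
matches(found)
```

Note that the integer pattern is not anchored, so it also matches strings that merely contain digits; add ^ and $ if the whole string must be an integer.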