ore_search {ore} | R Documentation |
Search for matches to a regular expression
Description
Search a character vector, or the content of a file or connection, for one
or more matches to an Oniguruma-compatible regular expression. Printing and
indexing methods are available for the results. ore_match is an alias for
ore_search.
Usage
ore_search(regex, text, all = FALSE, start = 1L, simplify = TRUE,
incremental = !all)
is_orematch(x)
## S3 method for class 'orematch'
x[j, k, ...]
## S3 method for class 'orematches'
x[i, j, k, ...]
## S3 method for class 'orematch'
print(x, lines = getOption("ore.lines", 0L),
context = getOption("ore.context", 30L), width = getOption("width", 80L),
...)
## S3 method for class 'orematches'
print(x, lines = getOption("ore.lines", 0L), simplify = TRUE, ...)
Arguments
regex |
A single character string or object of class |
text |
A vector of strings to match against, or a connection, or the
result of a call to |
all |
If |
start |
An optional vector of offsets (in characters) at which to start
searching. Will be recycled to the length of |
simplify |
If |
incremental |
If |
x |
An R object. |
j |
For indexing, the match number. |
k |
For indexing, the group number. |
... |
For |
i |
For indexing into an |
lines |
The maximum number of lines to print. The default is zero,
meaning no limit. For |
context |
The number of characters of context to include either side of each match. |
width |
The number of characters in each line of printed output. |
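As an illustration of the start argument, the following minimal sketch searches the same string twice, once from the beginning and once from the fourth character. It assumes the ore package is installed, and that the reported offsets count from the beginning of the string rather than from start; the variable names are purely illustrative.

```r
library(ore)

# Search from the first character (the default, start = 1L)
first <- ore_search("is", "This is a test")

# Begin at the fourth character, so the "is" inside "This" is skipped
later <- ore_search("is", "This is a test", start = 4L)

# Compare where each search found its match
first$offsets
later$offsets
```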
Value
For ore_search, an "orematch" object, or a list of the same, each with elements
- text
A copy of the text element for the current match, if it was a character vector; otherwise a single string with the content retrieved from the file or connection. If the source was a binary file (from ore_file(..., binary=TRUE)) then this element will be NULL.
- nMatches
The number of matches found.
- offsets
The offsets (in characters) of each match.
- byteOffsets
The offsets (in bytes) of each match.
- lengths
The lengths (in characters) of each match.
- byteLengths
The lengths (in bytes) of each match.
- matches
The matched substrings.
- groups
Equivalent metadata for each parenthesised subgroup in regex, in a series of matrices. If named groups are present in the regex then dimnames will be set appropriately.
For is_orematch, a logical vector indicating whether the specified
object has class "orematch". For extraction with one index, a
vector of matched substrings. For extraction with two indices, a vector
or matrix of substrings corresponding to captured groups.
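The elements listed above can be inspected directly on the returned object. This is a minimal sketch, assuming the ore package is installed; it reuses the pattern from the Examples section and also shows the list form returned when text has more than one element.

```r
library(ore)

# A single string gives a single "orematch" object
m <- ore_search("(\\w)(\\w)", "This is a test", all = TRUE)
is_orematch(m)

m$nMatches   # number of matches found
m$offsets    # character offset of each match
m$lengths    # character length of each match
m$matches    # the matched substrings

# Several strings give a list with one "orematch" per element of text
ms <- ore_search("\\w+", c("one two", "three"), all = TRUE)
ms[[1]]$matches
ms[[2]]$matches
```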
Note
Only named *or* unnamed groups will currently be captured, not both. If there are named groups in the pattern, then unnamed groups will be ignored.
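The following sketch illustrates this by mixing one named and one unnamed group in the same pattern. It assumes the ore package is installed, and that the groups element holds a matches matrix mirroring the top-level matches element; the group name "first" is arbitrary.

```r
library(ore)

# One named group ("first") followed by one unnamed group
m <- ore_search("(?<first>\\w)(\\w)", "This is a test", all = TRUE)

# The full matches are unaffected by which groups are captured
m$matches

# Per the note above, only the named group should appear here,
# with its name carried in the column dimnames
m$groups$matches
```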
By default the print method uses the crayon package (if it is
available) to determine whether or not the R terminal supports colour.
Alternatively, colour printing may be forced or disabled by setting the
"ore.colour" (or "ore.color") option to a logical value.
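For instance, colour can be switched off for a single call and the previous setting restored afterwards. This is a minimal sketch, assuming the ore package is installed; the context value is chosen arbitrarily.

```r
library(ore)

m <- ore_search("(\\w)(\\w)", "This is a test", all = TRUE)

# Disable colour output, keeping the previous option value for later
old <- options(ore.colour = FALSE)
print(m, context = 10L)

# Restore whatever was set before
options(old)
```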
See Also
ore for creating regex objects; matches and groups for an alternative to
indexing for extracting matching substrings.
Examples
# Pick out pairs of consecutive word characters
match <- ore_search("(\\w)(\\w)", "This is a test", all=TRUE)
# Find the second matched substring ("is", from "This")
match[2]
# Find the content of the second group in the second match ("s")
match[2,2]