re_match {rematch2} | R Documentation |
Extract Regular Expression Matches Into a Data Frame
Description
re_match wraps regexpr and returns the match results in a convenient
data frame. The data frame has one column for each capture group if
perl=TRUE, and one final column called .match for the matching
(sub)string. The columns of the capture groups are named if the
groups themselves are named.
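To make the wrapping concrete, the following is a minimal base R sketch of how the attributes returned by regexpr could be turned into such a data frame. It is a simplified approximation, not the rematch2 source: match_to_df is a hypothetical helper name, and the fallback names group1, group2, ... for unnamed groups are an assumption made for this illustration only.

```r
# Sketch only: approximates the idea behind re_match() with base R.
# `match_to_df` is a hypothetical name, not part of rematch2.
match_to_df <- function(text, pattern) {
  m <- regexpr(pattern, text, perl = TRUE)
  start <- as.vector(m)
  len <- attr(m, "match.length")

  # Whole match; elements without a match become NA
  whole <- substring(text, start, start + len - 1L)
  whole[start == -1L] <- NA_character_

  cols <- list()
  gstart <- attr(m, "capture.start")   # only present when the pattern has groups
  if (!is.null(gstart)) {
    glen <- attr(m, "capture.length")
    gnames <- attr(m, "capture.names")
    for (j in seq_len(ncol(gstart))) {
      g <- substring(text, gstart[, j], gstart[, j] + glen[, j] - 1L)
      g[gstart[, j] == -1L] <- NA_character_
      # Assumption: unnamed groups get a placeholder name in this sketch
      nm <- if (nzchar(gnames[j])) gnames[j] else paste0("group", j)
      cols[[nm]] <- g
    }
  }
  cols[[".text"]] <- text
  cols[[".match"]] <- whole
  do.call(data.frame, c(cols, list(stringsAsFactors = FALSE, check.names = FALSE)))
}

res <- match_to_df(
  c("2016-04-20", "not a date"),
  "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
)
print(res)
```

The sketch uses only documented attributes of regexpr (match.length, capture.start, capture.length, capture.names), which are available when perl = TRUE.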
Usage
re_match(text, pattern, perl = TRUE, ...)
Arguments
text |
Character vector. |
pattern |
A regular expression. See |
perl |
Logical. Should Perl-compatible regular expressions be used? Defaults to TRUE; setting it to FALSE disables capture groups. |
... |
Additional arguments to pass to |
Value
A data frame of character vectors: one column per capture
group, named if the group was named, and additional columns for
the input text and the first matching (sub)string. Each row
corresponds to an element in the text vector.
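As an illustration of working with the returned value, the sketch below assumes the rematch2 package is installed and uses a shortened version of the dates vector from the Examples section. It reads a named capture group as an ordinary column and drops the rows where nothing matched.

```r
library(rematch2)

dates <- c("2016-04-20", "not a date", "2015-01-21 19:58")
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"

m <- re_match(text = dates, pattern = isodaten)

# Named groups are available as columns, next to .text and .match
names(m)
m$year

# Elements without a match are reported as NA, so they are easy to drop
matched <- m[!is.na(m$.match), ]
print(matched)
```

Because each row corresponds to one element of text, the result can be bound back to the original data by position.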
Note
re_match uses PCRE-compatible regular expressions by default
(i.e. perl = TRUE in regexpr). You can switch this off, but if you
do, capture groups will no longer be reported, as they are only
supported by PCRE.
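The difference can be observed directly in base R. The sketch below calls regexpr with both engines on an illustrative pattern and input (chosen here, not taken from the package) and checks whether capture positions are attached to the result.

```r
pattern <- "([0-9]{4})-([0-1][0-9])"

m_pcre <- regexpr(pattern, "2016-04", perl = TRUE)
m_tre  <- regexpr(pattern, "2016-04", perl = FALSE)

# Both engines find the same overall match ...
same_start <- as.vector(m_pcre) == as.vector(m_tre)

# ... but only the PCRE result carries capture group positions
has_groups_pcre <- !is.null(attr(m_pcre, "capture.start"))
has_groups_tre  <- !is.null(attr(m_tre, "capture.start"))

print(c(pcre = has_groups_pcre, tre = has_groups_tre))
```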
See Also
Other tidy regular expression matching:
re_exec_all(), re_exec(), re_match_all()
Examples
dates <- c("2016-04-20", "1977-08-08", "not a date", "2016",
"76-03-02", "2012-06-30", "2015-01-21 19:58")
isodate <- "([0-9]{4})-([0-1][0-9])-([0-3][0-9])"
re_match(text = dates, pattern = isodate)
# The same with named groups
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
re_match(text = dates, pattern = isodaten)