fgsub {textclean} | R Documentation |
Replace a Regex with a Functional Operation on the Regex Match
Description
This is a stripped-down version of gsubfn
from the gsubfn
package. It finds each regex match, applies a function to the matched
text, and replaces the original match with the function's result. Note that
the stringi package is used for matching and extracting the regex
matches. For more powerful or flexible needs, please see the gsubfn
package.
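The extract, transform, and replace steps described above can be approximated in base R. The following is only a minimal sketch and not the textclean implementation: the name fgsub_sketch is hypothetical, and it swaps in base R's gregexpr and regmatches (with perl = TRUE) for the stringi functions that the package uses. It assumes fun returns exactly one string per match.

```r
# Hypothetical helper, not part of textclean: a base-R approximation of the
# extract -> transform -> replace steps described above.
fgsub_sketch <- function(x, pattern, fun) {
  ok <- !is.na(x)                             # leave missing values untouched
  keep <- x[ok]
  m <- gregexpr(pattern, keep, perl = TRUE)   # locate every match
  hits <- regmatches(keep, m)                 # extract the matched text
  # Apply `fun` to each match separately; it must return one string per match.
  new <- lapply(hits, function(h) {
    vapply(h, fun, character(1), USE.NAMES = FALSE)
  })
  regmatches(keep, m) <- new                  # put the results back in place
  x[ok] <- keep
  x
}

# Double every run of digits; strings without a match are returned unchanged.
fgsub_sketch(
  x = c(NA, 'I want 32 grapes', 'no numbers here'),
  pattern = '\\d+',
  fun = function(x) as.character(as.numeric(x) * 2)
)
```

Missing values are skipped explicitly because assigning through regmatches does not preserve NA elements on its own.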
Usage
fgsub(x, pattern, fun, ...)
Arguments
x |
A character vector. |
pattern |
Character string to be matched in the given character vector. |
fun |
A function to operate on the extracted matches. |
... |
Ignored. |
Value
Returns a character vector in which each match of pattern has been replaced by the result of applying fun to that match.
See Also
gsubfn
Examples
## In this example the regex looks for words that contain a lowercase letter
## followed by the same letter at least 2 more times. It then extracts these
## words, splits them apart into letters, reverses the letters, pastes them
## back together, wraps them in double angle brackets, and then puts them back
## at the original locations.
fgsub(
    x = c(NA, 'df dft sdf', 'sd fdggg sd dfhhh d', 'ddd'),
    pattern = "\\b\\w*([a-z])(\\1{2,})\\w*\\b",
    fun = function(x) {
        paste0('<<', paste(rev(strsplit(x, '')[[1]]), collapse = ''), '>>')
    }
)
## In this example we extract numbers, strip out non-digits, coerce them to
## numeric, cut them in half, round up to the next integer, add the commas
## back, and put the results back at the original locations.
fgsub(
    x = c(NA, 'I want 32 grapes', 'he wants 4 ice creams',
        'they want 1,234,567 dollars'
    ),
    pattern = "[\\d,]+",
    fun = function(x) {
        prettyNum(
            ceiling(as.numeric(gsub('[^0-9]', '', x)) / 2),
            big.mark = ','
        )
    }
)
## In this example we extract leading zeros and convert them to an equal
## number of spaces.
fgsub(
    x = c(NA, "00:04", "00:08", "00:01", "06:14", "00:02", "00:04"),
    pattern = '^0+',
    fun = function(x) {gsub('0', ' ', x)}
)
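The replacement functions used in the examples above can also be checked on their own, without fgsub, by calling them on a single matched string. The sketch below does this for the number-halving and leading-zeros examples. The names halve_number and zeros_to_spaces are introduced here for illustration only and do not appear in the package.

```r
# Replacement function from the number-halving example, given a name here
# purely for illustration.
halve_number <- function(x) {
    prettyNum(
        ceiling(as.numeric(gsub('[^0-9]', '', x)) / 2),
        big.mark = ','
    )
}

halve_number('32')         # "16"
halve_number('4')          # "2"
halve_number('1,234,567')  # "617,284" (617283.5 rounded up, commas restored)

# Replacement function from the leading-zeros example.
zeros_to_spaces <- function(x) gsub('0', ' ', x)

# The result has the same number of characters as the input.
nchar(zeros_to_spaces('00')) == nchar('00')
```

Testing fun on individual matches first makes it easier to confirm that it returns a single string for each match before using it on a whole vector.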
[Package textclean version 0.9.3 Index]