str_search {tinycodet}    R Documentation

'stringi' Pattern Search Operators

Description

The x %s{}% p operator checks, for every string in character vector x, whether the pattern defined in p is present.
When supplying a list on the right-hand side (see s_pattern), one can optionally include the list element at = "start" or at = "end", to check for the pattern only at the start or only at the end of each string (see the Details section).

The x %s!{}% p operator is the same as x %s{}% p, except it checks for absence of the pattern, rather than presence.

For string (in)equality operators, see %s==% from the 'stringi' package.

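strfind()<- locates, extracts, or replaces found patterns.
It complements the other string-related operators, and uses the same s_pattern API.
It functions as follows:

  • strfind() extracts all found pattern occurrences and returns them in a list.

  • strfind(..., i = "all") locates all found pattern occurrences.

  • strfind(..., i = i), with integer vector i, locates the ith occurrence of the pattern and returns the locations in a matrix.

  • strfind() <- value replaces the found patterns in variable x with value, in place.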

Usage

x %s{}% p

x %s!{}% p

strfind(x, p, ..., i, rt)

strfind(x, p, ..., i, rt) <- value

Arguments

x

a string or character vector.
For strfind()<-, x must be the variable containing the character vector or string, since strfind()<- performs assignment in-place.

p

either a list with 'stringi' arguments (see s_pattern), or else a character vector with regular expressions.
See also the Details section.
Supported pattern types: regex, fixed, coll, charclass.

...

additional arguments to be specified.

i

One of the following can be given for i:

  • if i is not given or NULL, strfind() extracts all found pattern occurrences.

  • if i is the string "all", strfind() locates all found pattern occurrences.

  • if i is an integer vector, strfind() locates the ith pattern occurrence in each string.
    See the i argument in stri_locate_ith for details.

For strfind() <- value, i must not be specified.
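The three forms of i described above can be sketched as follows. This is a minimal illustration, assuming the 'tinycodet' package is installed and attached; the input strings and variable names are made up for the example.

```r
library(tinycodet)

x <- c("hello world", "goodbye")
p <- "[aeiou]" # a character vector p is treated as a regular expression

# i not given: extract all matches, one list element per string
ext <- strfind(x, p)
print(ext)

# i = "all": locate all matches, one list element per string
loc_all <- strfind(x, p, i = "all")
print(loc_all)

# numeric i: locate the second match, and the last match, in each string
loc_2nd <- strfind(x, p, i = 2)
loc_last <- strfind(x, p, i = -1)
print(loc_2nd)
print(loc_last)
```

A negative i counts from the last match backwards, which is how the Examples section obtains the second-last vowel.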

rt

Use rt to specify the replacement type that strfind()<- should perform.
One of the following can be given for rt:

  • if rt is not given, NULL or "vec", strfind()<- performs regular, vectorized replacement of all occurrences.

  • if rt = "dict", strfind()<- performs dictionary replacement of all occurrences.

  • if rt = "first", strfind()<- replaces only the first occurrence in each string.

  • if rt = "last", strfind()<- replaces only the last occurrence in each string.

Note: rt = "first" and rt = "last" only exist for convenience; for more specific locational replacement, use stri_locate_ith or strfind(..., i) with numeric i (see the Examples section).
For strfind(), rt must not be specified.
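As a rough sketch of the note above: rt = "first" and rt = "last" cover the two most common single replacements, while any other position can be handled by locating the ith match and replacing it through stringi's stri_sub()<-. The example assumes 'tinycodet' and 'stringi' are installed; the strings and variable names are illustrative.

```r
library(tinycodet)

x <- c("foo boo", "zoo too")

# rt = "first": replace only the first "o" in each string
x_first <- x
strfind(x_first, "o", rt = "first") <- "0"

# rt = "last": replace only the last "o" in each string
x_last <- x
strfind(x_last, "o", rt = "last") <- "0"

# any other position: locate the third "o", then replace it via stri_sub()<-
x_third <- x
loc <- strfind(x_third, "o", i = 3)
stringi::stri_sub(x_third, loc) <- "0"

print(x_first)
print(x_last)
print(x_third)
```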

value

a character vector giving the replacement values.

Details

Right-hand Side List for the %s{}% and %s!{}% Operators
When supplying a list to the right-hand side of the %s{}% and %s!{}% operators, one can add the argument at.
If at = "start", the operators will check if the pattern is present/absent at the start of the string.
If at = "end", the operators will check if the pattern is present/absent at the end of the string.
Unlike stri_startswith or stri_endswith, regex is supported by the %s{}% and %s!{}% operators.
See examples below.
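The following sketch illustrates the point about regex support, which stri_startswith and stri_endswith do not offer. It assumes 'tinycodet' is installed and that s_regex() accepts the at element in the same way as the s_fixed() calls in the Examples section; the strings are illustrative.

```r
library(tinycodet)

x <- c("apple pie", "banana split")

# does each string start with a vowel?
p_start <- s_regex("[aeiou]", case_insensitive = TRUE, at = "start")
starts_with_vowel <- x %s{}% p_start
print(starts_with_vowel)

# the negated operator asks the opposite question
print(x %s!{}% p_start)

y <- c("file1", "1file")

# does each string end with a digit?
p_end <- s_regex("[0-9]", at = "end")
ends_with_digit <- y %s{}% p_end
print(ends_with_digit)
```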

Vectorized Replacement vs Dictionary Replacement

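In vectorized replacement, x, p, and value are recycled to a common length, and the occurrences of p[j] in x[j] are replaced with value[j].
In dictionary replacement, p and value form pattern-replacement pairs, and every pair is applied to every string in x.
Notice that for single replacement, i.e. rt = "first" or rt = "last", it makes no sense to distinguish between vectorized and dictionary replacement, since only a single occurrence is replaced per string.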
See examples below.
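The difference between the two replacement types can be sketched on a small input where the results diverge. This assumes 'tinycodet' is installed; the strings and variable names are illustrative only.

```r
library(tinycodet)

p  <- c("cat", "dog")
rp <- c("CAT", "DOG")

# vectorized (default): p[j] is replaced by rp[j] in x[j] only
x_vec <- c("cat and dog", "cat and dog")
strfind(x_vec, p) <- rp

# dictionary: every pair p[j] => rp[j] is applied to every string
x_dict <- c("cat and dog", "cat and dog")
strfind(x_dict, p, rt = "dict") <- rp

print(x_vec)  # "CAT and dog" "cat and DOG"
print(x_dict) # "CAT and DOG" "CAT and DOG"
```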

Value

For the x %s{}% p and x %s!{}% p operators:
Return logical vectors.

For strfind():
Returns a list with extractions of all found patterns.

For strfind(..., i = "all"):
Returns a list with all found pattern locations.

For strfind(..., i = i) with integer vector i:
Returns an integer matrix with two columns, giving the start and end positions of the ith matches; a row holds two NAs if no match is found, and also if the corresponding string in x is NA.

For strfind() <- value:
Returns nothing, but performs in-place replacement (using R's default in-place semantics) of the found patterns in variable x.

Note

strfind()<- performs in-place replacement.
Therefore, the character vector or string on which to perform replacement must already exist as a variable.
Consider the following code:

strfind("hello", p = "e") <- "a" # this does not work: "hello" is not a variable

y <- "hello"
strfind(y, p = "e") <- "a" # this works fine

In the above code, the first strfind()<- call does not work, because the string needs to exist as a variable.

See Also

tinycodet_strings

Examples


# example of %s{}% and %s!{}% ====

x <- c(paste0(letters[1:13], collapse = ""),
       paste0(letters[14:26], collapse = ""))
print(x)
x %s{}% "a"
x %s!{}% "a"
which(x %s{}% "a")
which(x %s!{}% "a")
x[x %s{}% "a"]
x[x %s!{}% "a"]
x[x %s{}% "a"] <- 1
x[x %s!{}% "a"] <- 1
print(x)

x <- c(paste0(letters[1:13], collapse = ""),
       paste0(letters[14:26], collapse = ""))
x %s{}% "1"
x %s!{}% "1"
which(x %s{}% "1")
which(x %s!{}% "1")
x[x %s{}% "1"]
x[x %s!{}% "1"]
x[x %s{}% "1"] <- "a"
x[x %s!{}% "1"] <- "a"
print(x)

#############################################################################


# Example of %s{}% and %s!{}% with "at" argument ====

x <- c(paste0(letters, collapse = ""),
       paste0(rev(letters), collapse = ""), NA)
p <- s_fixed("abc", at = "start")
x %s{}% p
stringi::stri_startswith(x, fixed = "abc") # same as above

p <- s_fixed("xyz", at = "end")
x %s{}% p
stringi::stri_endswith(x, fixed = "xyz") # same as above

p <- s_fixed("cba", at = "end")
x %s{}% p
stringi::stri_endswith(x, fixed = "cba") # same as above

p <- s_fixed("zyx", at = "start")
x %s{}% p
stringi::stri_startswith(x, fixed = "zyx") # same as above



#############################################################################


# Example of transforming ith occurrence ====

# new character vector:
x <- c(paste0(letters[1:13], collapse = ""),
       paste0(letters[14:26], collapse = ""))
print(x)

# report ith (second and second-last) vowel locations:
p <- s_regex( # vowels
  rep("A|E|I|O|U", 2),
  case_insensitive = TRUE
)
loc <- strfind(x, p, i = c(2, -2))
print(loc)

# extract ith vowels:
extr <- stringi::stri_sub(x, from = loc)
print(extr)

# replace ith vowels with numbers:
repl <- chartr("aeiou", "12345", extr) # transformation
stringi::stri_sub(x, loc) <- repl
print(x)


#############################################################################


# Example of strfind for regular vectorized replacement ====

x <- rep('The quick brown fox jumped over the lazy dog.', 3)
print(x)
p <- c('quick', 'brown', 'fox')
rp <- c('SLOW',  'BLACK', 'BEAR')
x %s{}% p
strfind(x, p)
strfind(x, p) <- rp
print(x)

#############################################################################


# Example of strfind for dictionary replacement ====

x <- rep('The quick brown fox jumped over the lazy dog.', 3)
print(x)
p <- c('quick', 'brown', 'fox')
rp <- c('SLOW',  'BLACK', 'BEAR')
# thus dictionary is:
# quick => SLOW; brown => BLACK; fox => BEAR
strfind(x, p, rt = "dict") <- rp
print(x)


#############################################################################


# Example of strfind for first and last replacement ====

x <- rep('The quick brown fox jumped over the lazy dog.', 3)
print(x)
p <- s_fixed("the", case_insensitive = TRUE)
rp <- "One"
strfind(x, p, rt = "first") <- rp
print(x)

x <- rep('The quick brown fox jumped over the lazy dog.', 3)
print(x)
p <- s_fixed("the", case_insensitive = TRUE)
rp <- "Some Other"
strfind(x, p, rt = "last") <- rp
print(x)





[Package tinycodet version 0.5.3 Index]