gregexpr {microseq}    R Documentation

Extended gregexpr with substring retrieval

Description

An extension of the function base::gregexpr enabling retrieval of the matching substrings.

Usage

gregexpr(
  pattern,
  text,
  ignore.case = FALSE,
  perl = FALSE,
  fixed = FALSE,
  useBytes = FALSE,
  extract = FALSE
)

Arguments

pattern

Character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. Coerced by as.character to a character string if possible. If a character vector of length 2 or more is supplied, the first element is used with a warning. Missing values are not allowed.

text

A character vector where matches are sought, or an object which can be coerced by as.character to a character vector.

ignore.case

Logical. If FALSE, pattern matching is case sensitive; if TRUE, case is ignored during matching.

perl

Logical. Should Perl-compatible regular expressions be used?

fixed

Logical. If TRUE, ‘pattern’ is a string to be matched as is. Overrides all conflicting arguments.

useBytes

Logical. If TRUE, matching is done byte-by-byte rather than character-by-character. See grep for details.

extract

Logical indicating if matching substrings should be extracted and returned.
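
The matching arguments above are inherited from base::gregexpr, so their effect can be illustrated with base R alone. The following is a minimal sketch that calls base::gregexpr directly, on the assumption (stated in the Details section) that the microseq version behaves identically when extract = FALSE. The input strings are made up for illustration.

```r
# Base R only; illustrative input strings (not from the package).
text <- c("acATGt", "A.G and ATG")

# ignore.case = TRUE: the pattern "atg" also matches "ATG"
ci <- base::gregexpr("atg", text, ignore.case = TRUE)

# fixed = TRUE: "." is treated as a literal dot, not a wildcard
fx <- base::gregexpr("A.G", text, fixed = TRUE)

# Default (regular expression): "." matches any single character
rx <- base::gregexpr("A.G", text)

# perl = TRUE: Perl-compatible syntax such as lookahead is available
pl <- base::gregexpr("A(?=TG)", text, perl = TRUE)

# Start positions per input string; -1 means no match
print(lapply(rx, as.integer))
```

With fixed = TRUE the first string yields no match (reported as -1), whereas the regular-expression form matches "ATG" in both strings.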

Details

Extended version of base::gregexpr that can return the substrings matching the pattern. The last argument, ‘extract’, is the only difference from base::gregexpr. The default behaviour is identical to base::gregexpr, but setting extract = TRUE returns the matching substrings instead of their positions.
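
For reference, the default (positional) result has the following shape. This sketch uses base::gregexpr only, since the default behaviour is documented as identical to it; the third input string is an added illustration of the no-match case.

```r
# Base R only; "CCCC" is an illustrative string with no match.
sequences <- c("ACATGTCATGTCC", "CTTGTATGCTG", "CCCC")

hits <- base::gregexpr("ATG", sequences)

# One list element per input string: integer start positions,
# with the match lengths stored in the "match.length" attribute.
starts        <- lapply(hits, as.integer)
match_lengths <- lapply(hits, function(h) as.integer(attr(h, "match.length")))

print(starts)         # start positions; -1 signals no match
print(match_lengths)  # lengths of the matches; -1 signals no match
```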

Value

Either the value that base::gregexpr would return (extract = FALSE), or a ‘list’ of substrings matching the pattern (extract = TRUE). In the latter case there is one ‘list’ element for each string in ‘text’, and each element is a character vector of all matching substrings in the corresponding entry of ‘text’.
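
The list structure described for extract = TRUE can be approximated in base R by combining base::gregexpr with regmatches. The helper name extract_matches below is hypothetical, and this is only a sketch of the described return value, not necessarily how the package implements it; in particular, how the package represents strings without any match is an assumption here.

```r
# Hypothetical helper (not part of microseq): positions from
# base::gregexpr are turned into substrings by base::regmatches.
extract_matches <- function(pattern, text, ...) {
  hits <- base::gregexpr(pattern, text, ...)
  regmatches(text, hits)
}

# "CCCC" is an illustrative string with no match.
sequences <- c("ACATGTCATGTCC", "CTTGTATGCTG", "CCCC")
substrings <- extract_matches("ATG", sequences)

# One list element per input string, each a character vector of matches;
# in this base R sketch a string without matches gives character(0).
print(substrings)
```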

Author(s)

Lars Snipen and Kristian Liland.

See Also

grep

Examples

sequences <- c("ACATGTCATGTCC", "CTTGTATGCTG")
gregexpr("ATG", sequences, extract = TRUE)


[Package microseq version 2.1.6 Index]