m {caroline}R Documentation

Regexp Match Operator

Description

A grep/sub-like function that returns one or more back-referenced pattern matches in the form of a vector or as columns in a dataframe (respectively). Unlike sub, this function is more geared towards data extraction rather than data cleaning. The name is derived from the popular PERL regular expression 'match' operator function 'm' (eg. 'extraction =~ m/sought_text/').

Usage

m(pattern, vect, names="V", types="character", mismatch=NA, ...)

Arguments

pattern

A regular expression pattern with at least one back reference.

vect

A string or vector of strings one which to apply the pattern match.

names

The vector of names of the new variables to be created out of vect. Must be the same length as vect.

types

The vector of types of the new variables to be created out of vect. Must be the same length as vect.

mismatch

What do to when no pattern is found. NA returns NA, TRUE returns original value (currently only implimented for single match, vector returns)

...

other parameters passed on to grep

Value

Either a vector or a dataframe depending on the number of backreferences in the pattern.

See Also

sub, gsub, regexpr, grep, gregexpr.

Examples


## single vector output examples
m(pattern="asdf.([A-Z]{4}).", 
  vect=c('asdf.AS.fds','asdf.ABCD.asdf', '12.ASDF.asdf','asdf.REWQ.123'))


Rurls <- c('http://www.r-project.org',    'http://cran.r-project.org',
           'http://journal.r-project.org','http://developer.r-project.org')
m(pattern="http://([a-z]+).r-project.org", vect=Rurls)


# dataframe output examples

data(mtcars)
m(pattern="^([A-Za-z]+) ?(.*)$", 
  vect=rownames(mtcars), names=c('make','model'), types=rep('character',2))



[Package caroline version 0.9.2 Index]