R: Extract Elements From a (Atomic) Vector

pick {str2str}

R Documentation

Extract Elements From a (Atomic) Vector

Description

pick extracts the elements of an (atomic) vector that meet certain criteria: 1) matching by exact values or by regular expressions (pat), 2) including vs. excluding the matched value/expression (not), and 3) matching against elements or names (nm). It is intended primarily for character vectors, but can be used with vectors of other types (see typeof).

Usage

pick(x, val, pat = FALSE, not = FALSE, nm = FALSE, fixed = FALSE)

Arguments

`x`	atomic vector or an object with names (e.g., data.frame) if `nm` = TRUE.
`val`	atomic vector specifying which elements of `x` will be extracted. If `pat` = FALSE (default), then `val` should be an atomic vector of the same type (see `typeof`) as `x`, can have length > 1, and exact matching will be done via `is.element` (essentially `match`). If `pat` = TRUE, then `val` has to be a character vector of length 1 and partial matching will be done via `grepl`, with regular expressions available if `fixed` = FALSE (default). Note that if `nm` = TRUE, then `val` should refer to the names of `x` to determine which elements of `x` are extracted.
`pat`	logical vector of length 1 specifying whether `val` should be matched exactly (FALSE) via `is.element` (essentially `match`) or partially (TRUE) via `grepl`, optionally as a regular expression. See Details for a brief description of some common symbols and `help(regex)` for more.
`not`	logical vector of length 1 specifying whether `val` indicates values that should be retained (FALSE) or removed (TRUE).
`nm`	logical vector of length 1 specifying whether `val` refers to the names of `x` (TRUE) rather than the elements of `x` themselves (FALSE).
`fixed`	logical vector of length 1 specifying whether `val` refers to values as is (TRUE) or a regular expression (FALSE). Only used if `pat` = TRUE.

Details

pick allows for 8 different ways to extract elements from an (atomic) vector, created by the 2 x 2 x 2 combination of the logical arguments pat, not, and nm. When pat = FALSE (default), pick uses is.element (essentially match) and requires exact matching of val in x. When pat = TRUE, pick uses grepl and allows for partial matching of val in x, with val treated as a regular expression if fixed = FALSE (default).
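
The matching modes described above can be illustrated with base R alone. The following is a minimal sketch, not the str2str source: it only shows how is.element, grepl, logical negation, and names(x) would correspond to the pat, not, and nm arguments. The object names (exact, partial, excluded, by_name) are made up for illustration.

```r
# Illustration in base R only; this is NOT how pick is necessarily implemented.
chr <- setNames(c("one", "two", "three", "four", "five"), as.character(1:5))

# Exact matching (compare pat = FALSE): keep elements equal to any value in the set
exact <- chr[is.element(chr, c("one", "five"))]

# Partial matching (compare pat = TRUE): keep elements that contain "n" or "v"
partial <- chr[grepl(pattern = "n|v", x = chr)]

# Exclusion (compare not = TRUE): negate the logical index
excluded <- chr[!is.element(chr, c("two", "three", "four"))]

# Name-based selection (compare nm = TRUE): match against names(chr) instead
by_name <- chr[is.element(names(chr), c("1", "5"))]
```

In this sketch all four objects hold the elements "one" and "five", which is the same selection targeted by cases 1 to 5 in the Examples section.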

When dealing with regular expressions via pat = TRUE and fixed = FALSE, certain symbols within val are not interpreted as literal characters and instead have special meanings. Some of the most commonly used symbols are "." = any character, "|" = logical or, "^" = starts with, "\n" = new line, "\t" = tab.
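
The difference between regular-expression and literal interpretation can be seen by calling grepl directly, which is the function pick is documented to use when pat = TRUE. This is a small base R sketch; the vector x and the result names are made up for illustration.

```r
# Hypothetical input chosen so that "." and "|" appear as literal characters
x <- c("a.b", "axb", "a|b", "ab")

# As a regular expression, "." matches any single character
regex_dot <- grepl("a.b", x)                # TRUE TRUE TRUE FALSE

# With fixed = TRUE, "." is matched literally
fixed_dot <- grepl("a.b", x, fixed = TRUE)  # TRUE FALSE FALSE FALSE

# As a regular expression, "|" means "a" or "b"
regex_or <- grepl("a|b", x)                 # TRUE TRUE TRUE TRUE

# With fixed = TRUE, "|" is matched literally
fixed_or <- grepl("a|b", x, fixed = TRUE)   # FALSE FALSE TRUE FALSE

# "^" anchors the match to the start of the string
starts_ax <- grepl("^ax", x)                # FALSE TRUE FALSE FALSE
```

Setting fixed = TRUE is therefore the safer choice when val contains characters such as "." or "|" that should be matched as is.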

Value

A subset of x that includes only the elements meeting the criteria specified by the function call.

Examples

# pedagogical cases
chr <- setNames(object = c("one","two","three","four","five"), nm = as.character(1:5))
# 1) pat = FALSE, not = FALSE, nm = FALSE
pick(x = chr, val = c("one","five"), pat = FALSE, not = FALSE, nm = FALSE)
# 2) pat = FALSE, not = FALSE, nm = TRUE
pick(x = chr, val = c("1","5"), pat = FALSE, not = FALSE, nm = TRUE)
# 3) pat = FALSE, not = TRUE, nm = FALSE
pick(x = chr, val = c("two","three","four"), pat = FALSE, not = TRUE, nm = FALSE)
# 4) pat = FALSE, not = TRUE, nm = TRUE
pick(x = chr, val = c("2","3","4"), pat = FALSE, not = TRUE, nm = TRUE)
# 5) pat = TRUE, not = FALSE, nm = FALSE
pick(x = chr, val = "n|v", pat = TRUE, not = FALSE, nm = FALSE)
# 6) pat = TRUE, not = FALSE, nm = TRUE
pick(x = chr, val = "1|5", pat = TRUE, not = FALSE, nm = TRUE)
# 7) pat = TRUE, not = TRUE, nm = FALSE
pick(x = chr, val = "t|r", pat = TRUE, not = TRUE, nm = FALSE)
# 8) pat = TRUE, not = TRUE, nm = TRUE
pick(x = chr, val = c("2|3|4"), pat = TRUE, not = TRUE, nm = TRUE)
# actual use cases
datasets <- data()[["results"]][, "Item"] # names of the available datasets
pick(x = datasets, val = c("attitude","mtcars","airquality"),
   not = TRUE) # all but the three most common datasets used in `str2str` package examples
pick(x = datasets, val = "state", pat = TRUE) # only datasets that contain "state"
pick(x = datasets, val = "state.*state", pat = TRUE) # only datasets that have
   # "state" twice in their name
pick(x = datasets, val = "US|UK", pat = TRUE) # only datasets that contain
   # "US" or "UK"
pick(x = datasets, val = "^US|^UK", pat = TRUE) # only datasets that start with
   # "US" or "UK"
pick(x = datasets, val = "k.*o|o.*k", pat = TRUE) # only datasets containing both
   # "k" and "o"

[Package str2str version 1.0.0 Index]