starts_with {tidyselect} | R Documentation |
Select variables that match a pattern
Description
These selection helpers match variables according to a given pattern.
-
starts_with()
: Starts with an exact prefix. -
ends_with()
: Ends with an exact suffix. -
contains()
: Contains a literal string. -
matches()
: Matches a regular expression. -
num_range()
: Matches a numerical range like x01, x02, x03.
Usage
starts_with(match, ignore.case = TRUE, vars = NULL)
ends_with(match, ignore.case = TRUE, vars = NULL)
contains(match, ignore.case = TRUE, vars = NULL)
matches(match, ignore.case = TRUE, perl = FALSE, vars = NULL)
num_range(prefix, range, suffix = "", width = NULL, vars = NULL)
Arguments
match |
A character vector. If length > 1, the union of the matches is taken. For |
ignore.case |
If |
vars |
A character vector of variable names. If not supplied,
the variables are taken from the current selection context (as
established by functions like |
perl |
Should Perl-compatible regexps be used? |
prefix , suffix |
A prefix/suffix added before/after the numeric range. |
range |
A sequence of integers, like |
width |
Optionally, the "width" of the numeric range. For example, a range of 2 gives "01", a range of three "001", etc. |
Examples
Selection helpers can be used in functions like dplyr::select()
or tidyr::pivot_longer()
. Let's first attach the tidyverse:
library(tidyverse) # For better printing iris <- as_tibble(iris)
starts_with()
selects all variables matching a prefix and
ends_with()
matches a suffix:
iris %>% select(starts_with("Sepal")) #> # A tibble: 150 x 2 #> Sepal.Length Sepal.Width #> <dbl> <dbl> #> 1 5.1 3.5 #> 2 4.9 3 #> 3 4.7 3.2 #> 4 4.6 3.1 #> # i 146 more rows iris %>% select(ends_with("Width")) #> # A tibble: 150 x 2 #> Sepal.Width Petal.Width #> <dbl> <dbl> #> 1 3.5 0.2 #> 2 3 0.2 #> 3 3.2 0.2 #> 4 3.1 0.2 #> # i 146 more rows
You can supply multiple prefixes or suffixes. Note how the order of variables depends on the order of the suffixes and prefixes:
iris %>% select(starts_with(c("Petal", "Sepal"))) #> # A tibble: 150 x 4 #> Petal.Length Petal.Width Sepal.Length Sepal.Width #> <dbl> <dbl> <dbl> <dbl> #> 1 1.4 0.2 5.1 3.5 #> 2 1.4 0.2 4.9 3 #> 3 1.3 0.2 4.7 3.2 #> 4 1.5 0.2 4.6 3.1 #> # i 146 more rows iris %>% select(ends_with(c("Width", "Length"))) #> # A tibble: 150 x 4 #> Sepal.Width Petal.Width Sepal.Length Petal.Length #> <dbl> <dbl> <dbl> <dbl> #> 1 3.5 0.2 5.1 1.4 #> 2 3 0.2 4.9 1.4 #> 3 3.2 0.2 4.7 1.3 #> 4 3.1 0.2 4.6 1.5 #> # i 146 more rows
contains()
selects columns whose names contain a word:
iris %>% select(contains("al")) #> # A tibble: 150 x 4 #> Sepal.Length Sepal.Width Petal.Length Petal.Width #> <dbl> <dbl> <dbl> <dbl> #> 1 5.1 3.5 1.4 0.2 #> 2 4.9 3 1.4 0.2 #> 3 4.7 3.2 1.3 0.2 #> 4 4.6 3.1 1.5 0.2 #> # i 146 more rows
starts_with()
, ends_with()
, and contains()
do not use regular expressions. To select with a
regexp use matches()
:
# [pt] is matched literally: iris %>% select(contains("[pt]al")) #> # A tibble: 150 x 0 # [pt] is interpreted as a regular expression iris %>% select(matches("[pt]al")) #> # A tibble: 150 x 4 #> Sepal.Length Sepal.Width Petal.Length Petal.Width #> <dbl> <dbl> <dbl> <dbl> #> 1 5.1 3.5 1.4 0.2 #> 2 4.9 3 1.4 0.2 #> 3 4.7 3.2 1.3 0.2 #> 4 4.6 3.1 1.5 0.2 #> # i 146 more rows
starts_with()
selects all variables starting with a prefix. To
select a range, use num_range()
. Compare:
billboard %>% select(starts_with("wk")) #> # A tibble: 317 x 76 #> wk1 wk2 wk3 wk4 wk5 wk6 wk7 wk8 wk9 wk10 wk11 wk12 wk13 #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 87 82 72 77 87 94 99 NA NA NA NA NA NA #> 2 91 87 92 NA NA NA NA NA NA NA NA NA NA #> 3 81 70 68 67 66 57 54 53 51 51 51 51 47 #> 4 76 76 72 69 67 65 55 59 62 61 61 59 61 #> # i 313 more rows #> # i 63 more variables: wk14 <dbl>, wk15 <dbl>, wk16 <dbl>, wk17 <dbl>, #> # wk18 <dbl>, wk19 <dbl>, wk20 <dbl>, wk21 <dbl>, ... billboard %>% select(num_range("wk", 10:15)) #> # A tibble: 317 x 6 #> wk10 wk11 wk12 wk13 wk14 wk15 #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 NA NA NA NA NA NA #> 2 NA NA NA NA NA NA #> 3 51 51 51 47 44 38 #> 4 61 61 59 61 66 72 #> # i 313 more rows
See Also
The selection language page, which includes links to other selection helpers.