find_columns {datawizard} | R Documentation |
Find or get columns in a data frame based on search patterns
Description
find_columns() returns column names from a data set that match a certain search pattern, while get_columns() returns the found data. data_select() is an alias for get_columns(), and data_find() is an alias for find_columns().
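To make the distinction between the two functions concrete, here is a minimal sketch using the built-in iris data. It assumes an installed datawizard version that still exports find_columns() and get_columns() under these names; the variable names cols and sepal_data are illustrative only.

```r
library(datawizard)

# find_columns() returns only the names of the matching columns
cols <- find_columns(iris, select = starts_with("Sepal"))
print(cols)

# get_columns() returns the matching columns themselves as a data frame
sepal_data <- get_columns(iris, select = starts_with("Sepal"))
print(head(sepal_data, 3))

# the column names of the returned data correspond to the names found above
print(identical(names(sepal_data), cols))
```

The same select and exclude arguments are passed to both functions, so a pattern can be checked with find_columns() before the data is extracted with get_columns().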
Usage
find_columns(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_find(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

get_columns(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_select(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)
Arguments
data |
A data frame. |
select |
Variables that will be included when performing the required tasks. Can be either
If |
exclude |
See |
ignore_case |
Logical, if |
regex |
Logical, if |
verbose |
Toggle warnings. |
... |
Arguments passed down to other functions. Mostly not used yet. |
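The argument descriptions above are brief, so the following sketch illustrates how ignore_case and regex might be used. It rests on two assumptions that are not spelled out on this page: that ignore_case = TRUE makes select helpers match regardless of letter case, and that regex = TRUE treats a character string passed to select as a regular expression. The variable names are illustrative.

```r
library(datawizard)

# by default matching is case-sensitive, so a lower-case pattern may find
# nothing; verbose = FALSE silences the resulting warning
no_match <- find_columns(iris, select = starts_with("sepal"), verbose = FALSE)

# assumption: ignore_case = TRUE lets "sepal" match the "Sepal.*" columns
lower_match <- find_columns(
  iris,
  select = starts_with("sepal"),
  ignore_case = TRUE
)
print(lower_match)

# assumption: with regex = TRUE, the string is used as a regular expression
regex_match <- find_columns(iris, select = "^Pet", regex = TRUE)
print(regex_match)
```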
Details
Note that it is possible to either pass an entire select helper or only the pattern inside a select helper as a function argument:
foo <- function(data, pattern) {
  find_columns(data, select = starts_with(pattern))
}
foo(iris, pattern = "Sep")

foo2 <- function(data, pattern) {
  find_columns(data, select = pattern)
}
foo2(iris, pattern = starts_with("Sep"))
This means that it is also possible to use loop values as arguments or patterns:
for (i in c("Sepal", "Sp")) {
  head(iris) |>
    find_columns(select = starts_with(i)) |>
    print()
}
However, this behavior is limited to a "single-level function". It will not work in nested functions, like below:
inner <- function(data, arg) {
  find_columns(data, select = arg)
}
outer <- function(data, arg) {
  inner(data, starts_with(arg))
}
outer(iris, "Sep")
In this case, it is better to pass the whole select helper as the argument of outer():
outer <- function(data, arg) {
  inner(data, arg)
}
outer(iris, starts_with("Sep"))
Value
find_columns() returns a character vector with column names that matched the pattern in select and exclude, or NULL if no matching column name was found. get_columns() returns a data frame with matching columns.
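Because find_columns() may return NULL, downstream code usually needs a guard before using the result. The sketch below shows one way to do this; it assumes an installed datawizard version that exports find_columns(), and the pattern "xyz" is chosen only because no iris column starts with it.

```r
library(datawizard)

# no column in iris starts with "xyz"; verbose = FALSE suppresses the warning
cols <- find_columns(iris, select = starts_with("xyz"), verbose = FALSE)

# length() is 0 for NULL, so this check also covers an empty character vector
if (length(cols) == 0) {
  message("No matching columns found.")
} else {
  print(cols)
}
```

Checking length() rather than is.null() keeps the guard valid even if a different version were to return an empty character vector instead of NULL.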
See Also
Functions to rename stuff: data_rename(), data_rename_rows(), data_addprefix(), data_addsuffix()
Functions to reorder or remove columns: data_reorder(), data_relocate(), data_remove()
Functions to reshape, pivot or rotate data frames: data_to_long(), data_to_wide(), data_rotate()
Functions to recode data: rescale(), reverse(), categorize(), recode_values(), slide()
Functions to standardize, normalize, rank-transform: center(), standardize(), normalize(), ranktransform(), winsorize()
Split and merge data frames: data_partition(), data_merge()
Functions to find or select columns: data_select(), data_find()
Functions to filter rows: data_match(), data_filter()
Examples
# Find column names by pattern
find_columns(iris, starts_with("Sepal"))
find_columns(iris, ends_with("Width"))
find_columns(iris, regex("\\."))
find_columns(iris, c("Petal.Width", "Sepal.Length"))
# starts with "Sepal", but must not contain "Width"
find_columns(iris, starts_with("Sepal"), exclude = contains("Width"))
# find numeric columns with mean > 3.5
numeric_mean_35 <- function(x) is.numeric(x) && mean(x, na.rm = TRUE) > 3.5
find_columns(iris, numeric_mean_35)