compare_conditions {headliner} | R Documentation |
Compare two conditions within a data frame
Description
Using logic that filter
can interpret,
compare_conditions()
will summarize the data aggregating condition x
and
condition y
Usage
compare_conditions(df, x, y, .cols = everything(), .fns = lst(mean))
Arguments
df |
data frame |
x |
condition for comparison, same criteria you would use in 'dplyr::filter', used in contrast to the reference group 'y' |
y |
condition for comparison, same criteria you would use in 'dplyr::filter', used in contrast to the reference group 'x' |
.cols |
columns to use in comparison |
.fns |
named list of the functions to use, ex: list(avg = mean, sd = sd) 'purrr' style phrases are also supported like list(mean = ~mean(.x, na.rm = TRUE), sd = sd) and dplyr::lst(mean, sd) will create a list(mean = mean, sd = sd) |
Details
compare_conditions()
passes its arguments to
across
. The .cols
and .fns
work the same. For
clarity, it is helpful to use the lst
function for the
.fns
parameter. Using
compare_conditions(..., .cols = my_var, .fns = lst(mean, sd))
will return
the values mean_my_var_x
, mean_my_var_y
, sd_my_var_x
and sd_my_var_x
Value
Returns a data frame that is either 1 row, or if grouped, 1 row per group.
Examples
# compare_conditions works similar to dplyr::across()
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = rotten_tomatoes
)
# because data frames are just fancy lists, you pass the result to headline_list()
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = rotten_tomatoes
) |>
headline_list("a difference of {delta} points")
# you can return multiple objects to compare
# 'view_List()' is a helper to see list objects in a compact way
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = c(rotten_tomatoes, metacritic),
.fns = dplyr::lst(mean, sd)
) |>
view_list()
# you can use any of the `tidyselect` helpers
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = dplyr::starts_with("bo_")
)
# if you want to compare x to the overall average, use y = TRUE
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = TRUE,
.cols = rotten_tomatoes
)
# to get the # of observations use length() instead of n()
# note: don't pass the parentheses
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = rotten_tomatoes, # can put anything here really
.fns = list(n = length)
)
# you can also use purrr-style lambdas
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = rotten_tomatoes,
.fns = list(avg = ~ sum(.x) / length(.x))
)
# you can compare categorical data with functions like dplyr::n_distinct()
pixar_films |>
compare_conditions(
x = (rating == "G"),
y = (rating == "PG"),
.cols = film,
.fns = list(distinct = dplyr::n_distinct)
)