expose {ruler}    R Documentation

Expose data to rule packs

Description

Function for applying rule packs to data.

Usage

expose(.tbl, ..., .rule_sep = inside_punct("\\._\\."),
  .remove_obeyers = TRUE, .guess = TRUE)

Arguments

.tbl

Data frame of interest.

...

Rule packs. They can be in pure form or inside a list (at any depth).

.rule_sep

Regular expression used as separator between column and rule names in col packs and cell packs.

.remove_obeyers

Whether to remove elements that obey the rules from the report.

.guess

Whether to guess the pack type of a rule pack of unsupported type (see Guessing).

Details

expose() applies all supplied rule packs to the data, creates an exposure object from the results, and stores it in the attribute 'exposure'. .tbl is guaranteed not to be modified in any other way, so expose() can be used safely inside a pipe.
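A minimal sketch of this behaviour, assuming the ruler and dplyr packages are installed. The pack name dims and the rule name has_rows are illustrative only and not part of the package:

```r
library(ruler)
library(dplyr)

# Hypothetical data pack: a single check that the table has at least one row.
has_rows_pack <- data_packs(
  dims = . %>% summarise(has_rows = nrow(.) > 0)
)

exposed <- mtcars %>% expose(has_rows_pack)

# The exposure is attached as an attribute; the shape of the data is unchanged.
exposure <- attr(exposed, "exposure")
class(exposure)
dim(exposed)
```

Because only the attribute is added, further data-processing steps can follow expose() in the same pipe.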

It is a good idea to name all rule packs: either explicitly in ... (if they are not supplied inside a list) or at creation with the respective rule pack function. A missing name is imputed based on the exposure attribute possibly already present in .tbl and on the supplied rule packs. Imputation is similar to that in rules() but is applied to every pack type separately.

The default value for .rule_sep is a regular expression matching the characters ._. surrounded by non-alphanumeric characters. It is chosen to work smoothly with dplyr's scoped verbs when rules() is used instead of a plain list. In most cases it should not be changed; if it is, it should be kept consistent with .prefix in rules().
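To see how the separator relates to column names, the sketch below applies a cell-style pack built with rules() and a scoped verb, then checks the output names against the default separator. It assumes ruler and dplyr are installed; the rule name neg_value is illustrative, and the exact column names depend on how the installed dplyr version names outputs of scoped verbs:

```r
library(ruler)
library(dplyr)

# Default separator: the characters ._. with surrounding punctuation.
sep <- inside_punct("\\._\\.")

# rules() prefixes each rule name, so the scoped verb yields output names
# that combine the column name and the rule name.
cell_out <- mtcars %>% transmute_all(rules(neg_value = . < 0))

names(cell_out)

# Check which output names contain the separator.
has_sep <- grepl(sep, names(cell_out))
has_sep
```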

Value

The input .tbl, possibly with an added 'exposure' attribute containing the resulting exposure. If .tbl already has an 'exposure' attribute, the new result is combined (bound) with it.
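A short sketch of combining exposures by calling expose() twice in one pipe, assuming ruler and dplyr are installed. The pack and rule names are illustrative:

```r
library(ruler)
library(dplyr)

exposed_twice <- mtcars %>%
  # First exposure: one data pack.
  expose(data_packs(dims = . %>% summarise(has_rows = nrow(.) > 0))) %>%
  # Second exposure: one row pack, combined with the exposure already present.
  expose(row_packs(rows = . %>% transmute(pos_sum = rowSums(.) > 0)))

# Information about all packs accumulated so far.
packs_info <- get_packs_info(exposed_twice)
packs_info
```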

Guessing

To work properly in some edge cases, pack types should be specified with the appropriate function. However, with .guess equal to TRUE, expose() will guess the pack type based on the pack's output after applying it to .tbl. It uses the following features:
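Independently of how guessing works, the recommendation to specify pack types explicitly can be sketched as follows. This assumes ruler and dplyr are installed; the pack and rule names are illustrative, and the grouping columns vs and am are chosen only for the example:

```r
library(ruler)
library(dplyr)

# Plain functional sequences; their pack types are declared below.
row_pack <- . %>% transmute(neg_row_sum = rowSums(.) < 0)
cell_pack <- . %>% transmute_all(rules(neg_value = . < 0))
group_pack <- . %>%
  group_by(vs, am) %>%
  summarise(n_low = n() > 10)

report <- mtcars %>%
  expose(
    row_packs(row_pack = row_pack),
    cell_packs(cell_pack = cell_pack),
    # Grouping columns are stated explicitly rather than guessed.
    group_packs(group_pack = group_pack, .group_vars = c("vs", "am")),
    .guess = FALSE
  ) %>%
  get_report()

report
```

With every pack type declared, .guess = FALSE no longer causes an error, and the result does not depend on the shape of each pack's output.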

Examples

my_rule_pack <- . %>% dplyr::summarise(nrow_neg = nrow(.) < 0)
my_data_packs <- data_packs(my_data_pack_1 = my_rule_pack)

# These pipes give identical results
mtcars %>%
  expose(my_data_packs) %>%
  get_report()

mtcars %>%
  expose(my_data_pack_1 = my_rule_pack) %>%
  get_report()

# This throws an error because no pack type is specified for my_rule_pack
## Not run: 
mtcars %>% expose(my_data_pack_1 = my_rule_pack, .guess = FALSE)

## End(Not run)

# Edge cases that argue against using '.guess = TRUE' in robust code
group_rule_pack <- . %>%
  dplyr::mutate(vs_one = vs == 1) %>%
  dplyr::group_by(vs_one, am) %>%
  dplyr::summarise(n_low = dplyr::n() > 10)
group_rule_pack_dummy <- . %>%
  dplyr::mutate(vs_one = vs == 1) %>%
  dplyr::group_by(mpg, vs_one, wt) %>%
  dplyr::summarise(n_low = dplyr::n() > 10)
row_rule_pack <- . %>% dplyr::transmute(neg_row_sum = rowSums(.) < 0)
cell_rule_pack <- . %>% dplyr::transmute_all(rules(neg_value = . < 0))

# Only column 'am' is guessed as a grouping column, and it defines non-unique levels.
## Not run: 
mtcars %>%
  expose(group_rule_pack, .remove_obeyers = FALSE, .guess = TRUE) %>%
  get_report()

## End(Not run)

# Values in `var` should contain combinations of the three grouping columns, but
# column 'vs_one' is guessed as a rule. No error is thrown because the guessed
# grouping columns define unique levels.
mtcars %>%
  expose(group_rule_pack_dummy, .remove_obeyers = FALSE, .guess = TRUE) %>%
  get_report()

# Column 'id' of the results should contain the value 1, not 0.
mtcars %>%
  dplyr::slice(1) %>%
  expose(row_rule_pack) %>%
  get_report()

mtcars %>%
  dplyr::slice(1) %>%
  expose(cell_rule_pack) %>%
  get_report()

[Package ruler version 0.3.0 Index]