assert_ids {assertable} | R Documentation |
Assert that a data.frame contains all unique combinations of specified ID variables, and doesn't contain duplicates within combinations
Description
Given a data.frame or data.table object and a named list of id_vars, assert that all possible combinations of id_vars exist in the dataset,
that no combinations of id_vars exist in the dataset but not in id_vars,
and that there are no duplicate values within the dataset within unique combinations of id_vars.
If ids_only = T and assert_dups = T, returns all combinations of id_vars along with the n_duplicates: the count of duplicates within each combination.
If ids_only = F, returns all duplicate observations from the original dataset along with n_duplicates
and duplicate_id: a unique ID for each duplicate value within each combination of id_vars.
Usage
assert_ids(data, id_vars, assert_combos = TRUE, assert_dups = TRUE,
ids_only = TRUE, warn_only = FALSE, quiet = FALSE)
Arguments
data |
A data.frame or data.table |
id_vars |
A named list of vectors, where the name of each vector must correspond to a column in data |
assert_combos |
Assert that the data object must contain all combinations of id_vars. Default = T. |
assert_dups |
Assert that the data object must not contain duplicate values within any combinations of id_vars. Default = T. |
ids_only |
By default, with assert_dups = T, the function returns the unique combinations of id_vars that have duplicate observations. If ids_only = F, will return every observation in the original dataset that are duplicates. |
warn_only |
Do you want to warn, rather than error? Will return all offending rows from the first violation of the assertion. Default=F. |
quiet |
Do you want to suppress the printed message when a test is passed? Default = F. |
Details
Note: if assert_combos = T and is violated, then assert_ids will stop execution and return results for assert_combos before evaluating the assert_dups segment of the code. If you want to make sure both options are evaluated even in case of a violation in assert_combos, call assert_ids twice (once with assert_dups = F, then assert_combos = F) with warn_only = T, and then conditionally stop your code if either call returns results.
Value
Throws error if test is violated. Will print the offending rows. If warn_only=T, will return all offending rows and only warn.
Examples
plants <- as.character(unique(CO2$Plant))
concs <- unique(CO2$conc)
ids <- list(Plant=plants,conc=concs)
assert_ids(CO2, ids)