hierarchy {validate} | R Documentation |
Check aggregates defined by a hierarchical code list
Description
Check all aggregates defined by a code hierarchy.
Usage
hierarchy(
values,
labels,
hierarchy,
by = NULL,
tol = 1e-08,
na_value = TRUE,
aggregator = sum,
...
)
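The `tol = 1e-08` default in the signature above suggests that an aggregate is compared with its components up to a numerical tolerance rather than for exact equality. The following base-R sketch only illustrates why that matters for floating-point sums; the variable names are made up for the illustration, and the comparison shown is an assumption, not the package's internal code.

```r
# Illustrative only: exact equality versus a tolerance-based comparison.
children <- c(0.1, 0.2)   # hypothetical component values
parent   <- 0.3           # hypothetical aggregate

exact_match   <- sum(children) == parent                # FALSE: rounding error
within_tol    <- abs(sum(children) - parent) <= 1e-08   # TRUE
print(c(exact = exact_match, tolerant = within_tol))
```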
Arguments
values |
bare (unquoted) name of a variable that holds values that must aggregate according to the hierarchy. |
labels |
bare (unquoted) name of a variable holding the grouping labels (codes from a hierarchical code list). |
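hierarchy |
A data frame defining the hierarchical code list. The first column holds the (child) codes and the second column holds the corresponding parent codes. |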
by |
A bare (unquoted) variable or list of variable names that occur in the data under scrutiny. The data will be split into groups according to these variables and the check is performed on each group. |
tol |
Numeric tolerance used when comparing an aggregate with its aggregated elements. |
na_value |
Logical or NA. The value assigned to elements that are not involved in any check. |
aggregator |
Function used to aggregate child values to their parent (by default: sum). |
... |
arguments passed to aggregator. |
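The splitting behaviour described for `by` can be pictured with base R's `split()` and `unsplit()`. The sketch below is an illustration under stated assumptions, not the package's implementation: `check_group()` is a hypothetical helper that only checks one hard-coded aggregate ("A" against "A.1" and "A.2"), and the data are invented.

```r
# Invented data: the same small code list observed in two years.
d <- data.frame(
  year   = c(2020, 2020, 2020, 2021, 2021, 2021),
  code   = c("A", "A.1", "A.2", "A", "A.1", "A.2"),
  volume = c(10, 4, 6, 10, 4, 5)
)

# Hypothetical per-group check: "A" must equal the sum of the other codes.
check_group <- function(g, tol = 1e-08) {
  ok <- abs(sum(g$volume[g$code != "A"]) - g$volume[g$code == "A"]) <= tol
  rep(ok, nrow(g))   # one result per record in the group
}

# Split by year, check each group, and put the results back in record order.
d$ok <- unsplit(lapply(split(d, d$year), check_group), d$year)
print(d)
```

In this sketch the 2020 records pass (4 + 6 equals 10) and the 2021 records fail (4 + 5 does not equal 10), so an error in one group does not affect the other.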
Value
A logical vector with the size of length(values). Every element involved in an aggregation error is labeled FALSE (aggregate plus aggregated elements). Elements that are involved in correct aggregations are set to TRUE, elements that are not involved in any check get the value na_value (by default: TRUE).
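The labeling rule above can be sketched in base R. This is a minimal illustration, not the package's code: `check_aggregates()` and the small `hier` table are hypothetical, and the sketch assumes unique labels, no missing values, a two-column child/parent table, and that a parent with no children present in the data is simply not checked.

```r
# Hypothetical helper illustrating the TRUE / FALSE / na_value labeling.
check_aggregates <- function(values, labels, hierarchy,
                             tol = 1e-08, na_value = TRUE,
                             aggregator = sum, ...) {
  child  <- as.character(hierarchy[[1]])
  parent <- as.character(hierarchy[[2]])
  labels <- as.character(labels)
  out <- rep(NA, length(values))   # NA marks "not involved in any check yet"
  for (p in intersect(unique(parent), labels)) {
    i_parent <- which(labels == p)
    i_child  <- which(labels %in% child[parent == p])
    if (length(i_child) == 0) next   # assumption: nothing to check
    ok  <- all(abs(aggregator(values[i_child], ...) - values[i_parent]) <= tol)
    idx <- c(i_parent, i_child)
    # An element that fails any check it takes part in stays FALSE.
    out[idx] <- ifelse(is.na(out[idx]), ok, out[idx] & ok)
  }
  out[is.na(out)] <- na_value
  out
}

# Invented child/parent table and data.
hier <- data.frame(child  = c("01.1", "01.2", "01.11", "01.12"),
                   parent = c("01",   "01",   "01.1",  "01.1"))
labels <- c("01", "01.1", "01.11", "01.12", "01.2", "foo")
values <- c(100,  70,     30,      40,      25,     60)

res <- check_aggregates(values, labels, hier, na_value = NA)
print(data.frame(labels, values, res))
```

Here "01.11" and "01.12" are TRUE because 30 + 40 equals 70. "01", "01.1" and "01.2" are FALSE because 70 + 25 does not equal 100, and "foo" takes part in no check, so it receives `na_value`.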
See Also
Other cross-record-helpers: contains_exactly(), do_by(), exists_any(), hb(), is_complete(), is_linear_sequence(), is_unique()
Examples
# We check some data against the built-in NACE revision 2 classification.
data(nace_rev2)
head(nace_rev2[1:4]) # columns 3 and 4 contain the child-parent relations.
d <- data.frame(
nace = c("01","01.1","01.11","01.12", "01.2")
, volume = c(100 ,70 , 30 ,40 , 25 )
)
# It is possible to perform checks interactively
d$nacecheck <- hierarchy(d$volume, labels = d$nace, hierarchy=nace_rev2[3:4])
# we have that "01.1" == "01.11" + "01.12", but not "01" == "01.1" + "01.2"
print(d)
# Usage as a validation rule is as follows
rules <- validator(hierarchy(volume, labels = nace, hierarchy=validate::nace_rev2[3:4]))
confront(d, rules)
# You can also pass the hierarchy as reference data, for example:
rules <- validator(hierarchy(volume, labels = nace, hierarchy=ref$nacecodes))
out <- confront(d, rules, ref=list(nacecodes=nace_rev2[3:4]))
summary(out)
# Set the output to NA when a code does not occur in the code list.
d <- data.frame(
nace = c("01","01.1","01.11","01.12", "01.2", "foo")
, volume = c(100 ,70 , 30 ,40 , 25 , 60)
)
d$nacecheck <- hierarchy(d$volume, labels = d$nace, hierarchy=nace_rev2[3:4]
, na_value = NA)
# As before, "01.1" == "01.11" + "01.12" holds, but "01" == "01.1" + "01.2" does not.
# The code "foo" does not occur in the code list, so its result is NA.
print(d)