.find {tabshiftr} | R Documentation |
Determine row or column on the fly
Description
Find the location of a variable based not on its columns/rows, but on a regular expression or function.
Usage
.find(
fun = NULL,
pattern = NULL,
col = NULL,
row = NULL,
invert = FALSE,
relative = FALSE
)
Arguments
fun
pattern
col
row
invert
relative
Details
This function is essentially a wildcard for cases where columns or rows are not known in advance and have to be determined on the fly. This can be very helpful when several tables contain the same variables but arrange them slightly differently.
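To illustrate the situation described above, here is a minimal base R sketch, not tabshiftr code. The two toy tables and the helper header_col are hypothetical and exist only for this illustration. Both tables hold the same variables in a different column order, so a fixed column index would be wrong for one of them, whereas a pattern locates the variable in both.

```r
# Two toy tables (hypothetical data) holding the same variables
# in a different column order.
table_a <- data.frame(X1 = c("commodities", "maize"),
                      X2 = c("harvested", "1121"),
                      stringsAsFactors = FALSE)
table_b <- data.frame(X1 = c("harvested", "1121"),
                      X2 = c("commodities", "maize"),
                      stringsAsFactors = FALSE)

# Hypothetical helper: return the index of every column whose header
# cell (first row) matches the regular expression.
header_col <- function(input, pattern) {
  header <- unlist(input[1, ], use.names = FALSE)
  which(grepl(pattern, header))
}

col_a <- header_col(table_a, "^comm")
col_b <- header_col(table_b, "^comm")
```

The same pattern yields a different column index per table, which is the kind of value a schema would otherwise have to hard-code.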
Value
The index values where the target was found.
How does this work
The first step in using any schema is validating it via the function validateSchema. This happens by default in reorganise, but can also be done manually, for example when debugging complicated schema descriptions.
When that function encounters a schema that asks to find columns or rows on the fly via .find, it combines all cells of each column and all cells of each row into one character string and matches the regular expression or function against those strings. Columns/rows that have a match are returned as the respective column/row value.
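The matching step described above can be sketched in base R roughly as follows. This is an illustration under assumptions, not the package's implementation: find_sketch, its margin argument and the toy table are hypothetical, and only the regular-expression case is covered.

```r
# Hypothetical helper, not part of tabshiftr: collapse the cells of each
# column (margin = 2) or each row (margin = 1) into one character string
# and test that string against a regular expression.
find_sketch <- function(input, pattern, margin = 2, invert = FALSE) {
  collapsed <- apply(as.matrix(input), margin, function(x) {
    paste(x[!is.na(x)], collapse = " ")
  })
  hits <- grepl(pattern, collapsed)
  if (invert) hits <- !hits  # select everything that did NOT match
  unname(which(hits))
}

# Toy input (hypothetical data), all cells stored as character.
input <- data.frame(
  X1 = c("commodities", "soybean", "maize"),
  X2 = c("harvested", "1111", "1121"),
  X3 = c("production", "1112", "1122"),
  stringsAsFactors = FALSE
)

cols  <- find_sketch(input, pattern = "commodities", margin = 2)
rows  <- find_sketch(input, pattern = "soybean", margin = 1)
other <- find_sketch(input, pattern = "commodities", margin = 2, invert = TRUE)
```

Collapsing first keeps the test to one match per column or row, at the cost of losing the position of the matching cell within it.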
Examples
# use regular expressions to find cell positions
(input <- tabs2shift$clusters_messy)
schema <- setCluster(id = "territories",
                     left = .find(pattern = "comm*"),
                     top = .find(pattern = "comm*")) %>%
  setIDVar(name = "territories", columns = c(1, 1, 4), rows = c(2, 9, 9)) %>%
  setIDVar(name = "year", columns = 4, rows = c(3:6), distinct = TRUE) %>%
  setIDVar(name = "commodities", columns = c(1, 1, 4)) %>%
  setObsVar(name = "harvested", columns = c(2, 2, 5)) %>%
  setObsVar(name = "production", columns = c(3, 3, 6))
schema
validateSchema(schema = schema, input = input)
# use a function to find rows
(input <- tabs2shift$messy_rows)
schema <-
  setFilter(rows = .find(fun = is.numeric, col = 1, invert = TRUE)) %>%
  setIDVar(name = "territories", columns = 1) %>%
  setIDVar(name = "year", columns = 2) %>%
  setIDVar(name = "commodities", columns = 3) %>%
  setObsVar(name = "harvested", columns = 5) %>%
  setObsVar(name = "production", columns = 6)
reorganise(schema = schema, input = input)