R: Strip a level of headers from a pivot table

behead {unpivotr}

R Documentation

Strip a level of headers from a pivot table

Description

behead() takes one level of headers from a pivot table and makes it part of the data. Think of it like tidyr::gather(), except that it works when there is more than one row of headers (or more than one column of row-headers), and it only works on tables that have first come through as_cells() or tidyxl::xlsx_cells().

Usage

behead(
  cells,
  direction,
  name,
  values = NULL,
  types = data_type,
  formatters = list(),
  drop_na = TRUE
)

behead_if(
  cells,
  ...,
  direction,
  name,
  values = NULL,
  types = data_type,
  formatters = list(),
  drop_na = TRUE
)

Arguments

`cells`	Data frame. The cells of a pivot table, usually the output of `as_cells()` or `tidyxl::xlsx_cells()`, or of a subsequent operation on those outputs.
`direction`	The direction between a data cell and its header, one of `"up"`, `"right"`, `"down"`, `"left"`, `"up-left"`, `"up-right"`, `"right-up"`, `"right-down"`, `"down-right"`, `"down-left"`, `"left-down"`, `"left-up"`. See `?direction`. `"up-ish"`, `"down-ish"`, `"left-ish"` and `"right-ish"` aren't available because they require certain ambiguities that are better handled by using `enhead()` directly rather than via `behead()`.
`name`	A name to give the new column that will be created, e.g. `"location"` if the headers are locations. Quoted (`"location"`, not `location`) because it doesn't refer to an actual object.
`values`	Optional. The column of `cells` to use as the values of each header. Given as a bare variable name. If omitted (the default), the `types` argument will be used instead.
`types`	The name of the column that names the data type of each cell. Usually called `data_types` (the default), this is a character column that names the other columns in `cells` that contain the values of each cell. E.g. a cell with a character value will have `"character"` in this column. Unquoted(`data_types`, not `"data_types"`) because it refers to an actual object.
`formatters`	A named list of functions for formatting each data type in a set of headers of mixed data types, e.g. when some headers are dates and others are characters. These can be given as `character = toupper` or `character = ~ toupper(.x)`, similar to purrr::map.
`drop_na`	logical Whether to filter out headers that have `NA` in the `value` column. Default: `TRUE`. This can happen with the output of `tidyxl::xlsx_cells()`, when an empty cell exists because it has formatting applied to it, but should be ignored.
`...`	Passed to dplyr::filter. logical predicates defined in terms of the variables in `.data`. Multiple conditions are combined with `&`. Only rows where the condition evaluates to `TRUE` are kept. The arguments in `...` are automatically quoted and evaluated in the context of the data frame. They support unquoting and splicing. See the dplyr `vignette("programming")` for an introduction to these concepts.

Value

A data frame

Examples

# A simple table with a row of headers
(x <- data.frame(a = 1:2, b = 3:4))

# Make a tidy representation of each cell
(cells <- as_cells(x, col_names = TRUE))

# Strip the cells in row 1 (the original headers) and use them as data
behead(cells, "N", foo)

# More complex example: pivot table with several layers of headers
(x <- purpose$`up-left left-up`)

# Make a tidy representation
cells <- as_cells(x)
head(cells)
tail(cells)

# Strip the headers and make them into data
tidy <-
  cells %>%
  behead("up-left", Sex) %>%
  behead("up", `Sense of purpose`) %>%
  behead("left-up", `Highest qualification`) %>%
  behead("left", `Age group (Life-stages)`) %>%
  dplyr::mutate(count = as.integer(chr)) %>%
  dplyr::select(-row, -col, -data_type, -chr)
head(tidy)

# Check against the provided 'tidy' version of the data.
dplyr::anti_join(tidy, purpose$Tidy)

# The provided 'tidy' data is missing a row for Male 15-24-year-olds with a
# postgraduate qualification and a sense of purpose between 0 and 6.  That
# seems to have been an oversight by Statistics New Zealand.

cells <- tibble::tribble(
       ~X1, ~adult, ~juvenile,
    "LION",    855,       677,
    "male",    496,       322,
  "female",    359,       355,
   "TIGER",    690,       324,
    "male",    381,       222,
  "female",    309,       102
  )
cells <- as_cells(cells, col_names = TRUE)

cells %>%
  behead_if(chr == toupper(chr), direction = "left-up", name = "species") %>%
  behead("left", "sex") %>%
  behead("up", "age") %>%
  dplyr::select(species, sex, age, population = dbl)

[Package unpivotr version 0.6.3 Index]