pack {unpivotr} | R Documentation |
Pack cell values from separate columns per data type into one list-column
Description
Pack cell values from separate columns per data type into one list-column
Usage
pack(
  cells,
  types = data_type,
  name = "value",
  drop_types = TRUE,
  drop_type_cols = TRUE
)
unpack(cells, values = value, name = "data_type", drop_packed = TRUE)
Arguments
cells |
A data frame of cells, one row per cell. For |
types |
For |
name |
A string. For |
drop_types |
For |
drop_type_cols |
For |
values |
For |
drop_packed |
For |
Details
When cells are represented by rows of a data frame, the values of the cells
will be in different columns according to their data type. For example, the
value of a cell containing text will be in a column called chr (or character
if it came via tidyxl). A column called data_type names, for each cell, which
column its value is in.
pack() rearranges the cell values so that they are all in one column, by
- taking each cell value, from whichever column it is in,
- making it an element of a list,
- naming each element according to the column it came from, and
- making the list into a new list-column of the original data frame.
By default, the original columns are dropped, and so is the data_type
column.
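The packing steps above can be sketched in base R. This is only an illustration of the idea, not unpivotr's actual implementation; the `cells` data frame and its contents are made up for the example.

```r
# Hypothetical cells: one row per cell, with values split across type columns.
cells <- data.frame(
  row = c(1L, 1L, 2L, 2L),
  col = c(1L, 2L, 1L, 2L),
  data_type = c("chr", "int", "chr", "int"),
  chr = c("a", NA, "b", NA),
  int = c(NA, 1L, NA, 2L),
  stringsAsFactors = FALSE
)

# Take each cell value from the column named by data_type, make it an element
# of a list, and name each element after the column it came from.
value <- lapply(
  seq_len(nrow(cells)),
  function(i) cells[[cells$data_type[i]]][i]
)
names(value) <- cells$data_type

# Attach the list as a list-column, dropping data_type and the type columns.
packed <- cells[, c("row", "col")]
packed$value <- value
str(packed$value)
```

Each element of `packed$value` keeps its own type, and the element names carry the information that was in data_type.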
unpack() is the complement: it moves the values from the list-column back into
separate columns per data type.
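The reverse direction can be sketched the same way. Again this is a base R illustration under simplifying assumptions (a made-up `packed` data frame, and only types that survive `unlist()`, so not dates), not the package's own code.

```r
# Hypothetical packed cells: a list-column whose element names are the types.
packed <- data.frame(row = c(1L, 1L), col = c(1L, 2L))
packed$value <- list(chr = "a", int = 1L)

# Recover data_type from the element names.
unpacked <- packed[, c("row", "col")]
unpacked$data_type <- names(packed$value)

# Create one column per type, filled with NA except where the cell has that type.
for (type in unique(unpacked$data_type)) {
  is_type <- unpacked$data_type == type
  column <- rep(NA, nrow(unpacked))
  # Assignment coerces the logical NA vector to the type of the values.
  column[is_type] <- unlist(packed$value[is_type], use.names = FALSE)
  unpacked[[type]] <- column
}
unpacked
```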
Packing and unpacking can be useful for dropping all columns of cells except
the ones that contain data. For example, tidyxl::xlsx_cells() returns a very
wide data frame, and to make it narrow you might do:
select(cells, row, col, character, numeric, date)
But what if you don't know in advance that the data types you need are
character, numeric and date? You might also need logical and error.
Instead, pack() all the data types into a single column, select it, and then
unpack().
pack(cells) %>% select(row, col, value) %>% unpack()
Functions
- unpack(): Unpack cell values from one list-column into separate columns per data type
Examples
# A normal data frame
w <- data.frame(foo = 1:2,
                bar = c("a", "b"),
                stringsAsFactors = FALSE)
w
# The same data, represented by one row per cell, with integer values in the
# `int` column and character values in the `chr` column.
x <- as_cells(w)
x
# pack() and unpack() are complements
pack(x)
unpack(pack(x))
# Drop non-data columns from a wide data frame of cells from tidyxl
if (require(tidyxl)) {
  cells <- tidyxl::xlsx_cells(system.file("extdata", "purpose.xlsx", package = "unpivotr"))
  cells

  pack(cells) %>%
    dplyr::select(row, col, value) %>%
    unpack()
}