collapse_to_rvec {rvec}    R Documentation

Convert a Data Frame Between 'Database' and 'Rvec' Formats

Description

collapse_to_rvec() converts a data frame from a 'database' format to an 'rvec' format. expand_from_rvec() does the opposite, converting a data frame from an rvec format to a database format.

Usage

collapse_to_rvec(data, draw = draw, values = value, by = NULL, type = NULL)

## S3 method for class 'data.frame'
collapse_to_rvec(data, draw = draw, values = value, by = NULL, type = NULL)

## S3 method for class 'grouped_df'
collapse_to_rvec(data, draw = draw, values = value, by = NULL, type = NULL)

expand_from_rvec(data, draw = "draw")

## S3 method for class 'data.frame'
expand_from_rvec(data, draw = "draw")

## S3 method for class 'grouped_df'
expand_from_rvec(data, draw = "draw")

Arguments

data

A data frame, possibly grouped.

draw

<tidyselect> The variable that uniquely identifies random draws within each combination of values for the 'by' variables. Must be quoted for expand_from_rvec().

values

<tidyselect> One or more variables in data that hold measurements.

by

<tidyselect> Variables used to stratify or cross-classify the data. See Details.

type

String specifying the class of rvec to use for each variable. Optional. See Details.

Details

In database format, each row represents one random draw. The data frame contains a 'draw' variable that distinguishes different draws within the same combination of 'by' variables. In rvec format, each row represents one combination of 'by' variables, and multiple draws are stored in an rvec. See below for examples.
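The difference between the two formats shows up most clearly in the number of rows. The following is a minimal sketch using made-up data; the column names region, draw, and value are illustrative assumptions, chosen so that the default draw and values arguments apply.

```r
library(rvec)

# Hypothetical database-format data: 2 regions x 3 draws = 6 rows
db <- data.frame(
  region = rep(c("North", "South"), each = 3),
  draw   = rep(1:3, times = 2),
  value  = c(1.2, 0.9, 1.1, 2.3, 2.0, 2.4)
)

# Relies on the defaults draw = draw and values = value;
# 'region' is treated as the 'by' variable because it is all that remains
rv <- collapse_to_rvec(db)

nrow(db)           # one row per draw within each region
nrow(rv)           # one row per region; draws are held in the rvec column
n_draw(rv$value)   # number of draws stored in the rvec

# Converting back restores one row per draw
db_again <- expand_from_rvec(rv)
nrow(db_again)
```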

Value

A data frame.

by argument

The by argument is used to specify stratifying variables. For instance, if by includes sex and age, then the data frame produced by collapse_to_rvec() has a separate row for each combination of sex and age.

If data is a grouped data frame, then the grouping variables take precedence over by.

If no value for by is provided, and data is not a grouped data frame, then collapse_to_rvec() assumes that all variables in data that are not included in values or draw should be included in by.
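As a rough illustration of the default behaviour, the sketch below builds a made-up data frame with two stratifying variables and compares the result of omitting by with the result of supplying it explicitly. The data and the variable names age and sex are assumptions for the example only.

```r
library(rvec)

# Hypothetical data: 3 draws for each of 2 ages x 2 sexes = 12 rows
db <- expand.grid(
  draw = 1:3,
  age  = c("0-39", "40+"),
  sex  = c("Female", "Male"),
  KEEP.OUT.ATTRS = FALSE,
  stringsAsFactors = FALSE
)
db$value <- seq_len(nrow(db)) / 10

# 'by' omitted: every column other than 'draw' and 'value' is used
rv_default <- collapse_to_rvec(db)

# 'by' supplied explicitly
rv_explicit <- collapse_to_rvec(db, by = c(age, sex))

# Both should have one row per age-sex combination
nrow(rv_default)
nrow(rv_explicit)
```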

type argument

By default, collapse_to_rvec() calls function rvec() on each values variable in data. rvec() chooses the class of the output (i.e. rvec_chr, rvec_dbl, rvec_int, or rvec_lgl) depending on the input. Types can instead be specified in advance, using the type argument. type is a string, each character of which specifies the class of the corresponding values variable. The characters have the following meanings:

"c": rvec_chr
"d": rvec_dbl
"i": rvec_int
"l": rvec_lgl
"?": Depends on inputs.

The codes for type are modified from ones used by the readr package.
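When there is more than one values variable, type needs one character per variable, in the same order. The sketch below uses made-up data with two values columns, count and rate (both names are illustrative). It assumes that "d" selects rvec_dbl, following the readr convention for doubles, in the same way that "i" selects rvec_int.

```r
library(rvec)

# Hypothetical data with two measurement columns
db <- data.frame(
  draw  = rep(1:2, times = 2),
  group = rep(c("a", "b"), each = 2),
  count = c(3, 5, 2, 4),
  rate  = c(0.1, 0.2, 0.3, 0.4)
)

# One character per values variable:
# "i" for 'count' (integer rvec), "d" for 'rate' (double rvec, assumed code)
rv <- collapse_to_rvec(db,
                       values = c(count, rate),
                       by = group,
                       type = "id")

class(rv$count)
class(rv$rate)
```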

See Also

collapse_to_rvec() and expand_from_rvec() are analogous to tidyr::nest() and tidyr::unnest(). The difference is that collapse_to_rvec() and expand_from_rvec() move values into and out of rvecs, while tidyr::nest() and tidyr::unnest() move them into and out of data frames. (tidyr::nest() and tidyr::unnest() are also a lot more flexible.)
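The analogy can be seen by applying both functions to the same input. This is only a sketch with made-up data; it assumes the tidyr package is installed, and the column names group, draw, and value are illustrative.

```r
library(rvec)
library(tidyr)

# Hypothetical database-format data
db <- data.frame(
  group = rep(c("a", "b"), each = 3),
  draw  = rep(1:3, times = 2),
  value = c(10, 12, 11, 20, 19, 22)
)

# tidyr: draws for each group go into a list-column of data frames
nested <- nest(db, data = c(draw, value))

# rvec: draws for each group go into a single rvec column
collapsed <- collapse_to_rvec(db)

nested$data[[1]]   # a data frame holding the draws for group "a"
collapsed$value    # an rvec with one element per group
```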

Examples

library(dplyr)
data_db <- tribble(
  ~occupation,    ~sim, ~pay,
  "Statistician", 1,    100,
  "Statistician", 2,    80,
  "Statistician", 3,    105,
  "Banker",       1,    400,
  "Banker",       2,    350,
  "Banker",       3,    420
)

## database format to rvec format
data_rv <- data_db |>
  collapse_to_rvec(draw = sim,
                   values = pay)
data_rv

## rvec format to database format
data_rv |>
  expand_from_rvec()

## provide a name for the draw variable
data_rv |>
  expand_from_rvec(draw = "sim")

## specify that rvec variable
## must be rvec_int
data_rv <- data_db |>
  collapse_to_rvec(draw = sim,
                   values = pay,
                   type = "i")

## specify stratifying variable explicitly,
## using 'by' argument
data_db |>
  collapse_to_rvec(draw = sim,
                   values = pay,
                   by = occupation)

## specify stratifying variable explicitly,
## using 'group_by'
library(dplyr)
data_db |>
  group_by(occupation) |>
  collapse_to_rvec(draw = sim,
                   values = pay)

[Package rvec version 0.0.6 Index]