dcc {distrr}    R Documentation

Data cube creation (dcc)

Description

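Creates a data cube from a data frame: .fun is evaluated on every combination of the modalities of the variables in .variables, including a total for each variable.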

Usage

dcc(.data, .variables, .fun = jointfun_, ...)

dcc2(.data, .variables, .fun = jointfun_, order_type = extract_unique2, ...)

dcc5(
  .data,
  .variables,
  .fun = jointfun_,
  .total = "Totale",
  order_type = extract_unique4,
  .all = TRUE,
  ...
)

Arguments

.data

data frame to be processed

.variables

variables to split data frame by, as a character vector (c("var1", "var2")).

.fun

function to apply to each piece (default: jointfun_)

...

additional functions passed to .fun.

order_type

a function like extract_unique or extract_unique2 (default: extract_unique2 for dcc2, extract_unique4 for dcc5).

.total

character string with the name to give to the subset of data that includes all the observations of a variable (default: "Totale").

.all

logical, indicating whether the functions have to be evaluated on the complete dataset (default: TRUE).

Value

a data cube, with a column for each categorical variable used, and a row for each combination of all the categorical variables' modalities. In addition to its own modalities, each variable also has a "Total" modality, which includes all the others. The data cube will contain marginal, conditional and joint empirical distributions...
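The structure described above can be illustrated with a minimal base R sketch. It is not the implementation used by dcc: the toy data frame, the helper name cube_sketch, and its value, fun and total arguments are all assumptions made for this illustration. Each variable is either kept or collapsed into a single "Total" modality, and a summary function is applied to every resulting group.

```r
# Hypothetical toy data (not the package's invented_wages dataset).
toy <- data.frame(
  gender = c("men", "women", "women", "men", "women"),
  sector = c("public", "public", "private", "private", "private"),
  wage   = c(2000, 2200, 1800, 2500, 2100),
  stringsAsFactors = FALSE
)

# cube_sketch() is a hypothetical helper written for this illustration only;
# it is not part of distrr and does not reproduce dcc()'s implementation.
cube_sketch <- function(data, variables, value, fun = mean, total = "Total") {
  k <- length(variables)
  # One row per choice of "keep this variable" / "collapse it into the total".
  keep <- expand.grid(rep(list(c(TRUE, FALSE)), k))
  pieces <- lapply(seq_len(nrow(keep)), function(i) {
    d <- data
    for (j in seq_len(k)) {
      # A collapsed variable takes the single value `total` for every row.
      if (!keep[i, j]) d[[variables[j]]] <- total
    }
    # Aggregate the value column over the (partly collapsed) variables.
    aggregate(d[value], by = d[variables], FUN = fun)
  })
  do.call(rbind, pieces)
}

cube <- cube_sketch(toy, c("gender", "sector"), value = "wage")
cube
```

With two variables of two modalities each, the sketch returns (2 + 1) x (2 + 1) = 9 rows: four joint groups, two rows for each variable taken alone, and one overall row.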

Examples

data("invented_wages")
str(invented_wages)
tmp <- dcc(.data = invented_wages, 
           .variables = c("gender", "sector"), .fun = jointfun_)
tmp
str(tmp)
tmp2 <- dcc2(.data = invented_wages,
             .variables = c("gender", "education"),
             .fun = jointfun_,
             order_type = extract_unique2)
tmp2
str(tmp2)

# dcc5 works like dcc2, but has an additional optional argument, .total,
# that can be added to give a name to the groups that include all the 
# observations of a variable.
tmp5 <- dcc5(.data = invented_wages,
             .variables = c("gender", "education"),
             .fun = jointfun_,
             .total = "TOTAL",
             order_type = extract_unique2)
tmp5


[Package distrr version 0.0.6 Index]