| model_aggregate {SSBtools} | R Documentation | 
Hierarchical aggregation via model specification
Description
Internally a dummy/model matrix is created according to the model specification.
This model matrix is used in the aggregation process via matrix multiplication and/or the function aggregate_multiple_fun.
Usage
model_aggregate(
  data,
  sum_vars = NULL,
  fun_vars = NULL,
  fun = NULL,
  hierarchies = NULL,
  formula = NULL,
  dim_var = NULL,
  remove_empty = NULL,
  preagg_var = NULL,
  dummy = TRUE,
  pre_aggregate = dummy,
  list_return = FALSE,
  pre_return = FALSE,
  verbose = TRUE,
  mm_args = NULL,
  ...
)
Arguments
data | 
 A data frame containing data to be aggregated  | 
sum_vars | 
 Variables to be summed. This will be done via matrix multiplication.  | 
fun_vars | 
 Variables to be aggregated by supplied functions.
This will be done via   | 
fun | 
 The   | 
hierarchies | 
 The   | 
formula | 
 The   | 
dim_var | 
 The   | 
remove_empty | 
 When non-NULL, the   | 
preagg_var | 
 Extra variables to be used as grouping elements in the pre-aggregate step  | 
dummy | 
 The   | 
pre_aggregate | 
 Whether to pre-aggregate data to reduce the dimension of the model matrix.
Note that all original   | 
list_return | 
 Whether to return a list of separate components including the model matrix   | 
pre_return | 
 Whether to return the pre-aggregate data as a two-component list. Can also be combined with   | 
verbose | 
 Whether to print information during calculations.  | 
mm_args | 
 List of further arguments passed to   | 
... | 
 Further arguments passed to   | 
Details
With formula input, limited output can be achieved by formula_selection (see example).
An attribute called startCol has been added to the output data frame to make this functionality work.
Value
A data frame or a list.
Examples
z <- SSBtoolsData("sprt_emp_withEU")
z$age[z$age == "Y15-29"] <- "young"
z$age[z$age == "Y30-64"] <- "old"
names(z)[names(z) == "ths_per"] <- "ths"
z$y <- 1:18
my_range <- function(x) c(min = min(x), max = max(x))
out <- model_aggregate(z, 
   formula = ~age:year + geo, 
   sum_vars = c("y", "ths"), 
   fun_vars = c(sum = "ths", mean = "y", med = "y", ra = "ths"), 
   fun = c(sum = sum, mean = mean, med = median, ra = my_range))
out
# Limited output can be achieved by formula_selection
formula_selection(out, ~geo)
# Using the single unnamed variable feature.
model_aggregate(z, formula = ~age, fun_vars = "y", 
                fun = c(sum = sum, mean = mean, med = median, n = length))
# To illustrate list_return and pre_return 
for (pre_return in c(FALSE, TRUE)) for (list_return in c(FALSE, TRUE)) {
  cat("\n=======================================\n")
  cat("list_return =", list_return, ", pre_return =", pre_return, "\n\n")
  out <- model_aggregate(z, formula = ~age:year, 
                         sum_vars = c("ths", "y"), 
                         fun_vars = c(mean = "y", ra = "y"), 
                         fun = c(mean = mean, ra = my_range), 
                         list_return = list_return,
                         pre_return = pre_return)
  cat("\n")
  print(out)
}
# To illustrate preagg_var 
model_aggregate(z, formula = ~age:year, 
sum_vars = c("ths", "y"), 
fun_vars = c(mean = "y", ra = "y"), 
fun = c(mean = mean, ra = my_range), 
preagg_var = "eu",
pre_return = TRUE)[["pre_data"]]
# To illustrate hierarchies 
geo_hier <- SSBtoolsData("sprt_emp_geoHier")
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"))
####  Special non-dummy cases illustrated below  ####
# Extend the hierarchy to make non-dummy model matrix  
geo_hier2 <- rbind(data.frame(mapsFrom = c("EU", "Spain"), 
                              mapsTo = "EUandSpain", sign = 1), geo_hier[, -4])
# Warning since non-dummy
# y and y_sum are different 
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"))
# No warning since dummy since unionComplement = TRUE (see ?HierarchyCompute)
# y and y_sum are equal   
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"),
                mm_args = list(unionComplement = TRUE))
# Non-dummy again, but no warning since dummy = FALSE
# Then pre_aggregate is by default set to FALSE (error when TRUE) 
# fun with extra argument needed (see ?dummy_aggregate)
# y and y_sum2 are equal
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum2 = "y"),
                fun = c(sum2 = function(x, y) sum(x * y)),
                dummy = FALSE)