Hierarchies2ModelMatrix {SSBtools} | R Documentation |
Model matrix representing crossed hierarchies
Description
Make a model matrix, x, that corresponds to data and represents all hierarchies crossed.
This means that aggregates corresponding to numerical variables can be computed as t(x) %*% y, where y is a matrix with one column for each numerical variable.
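The identity t(x) %*% y can be illustrated with a small hand-built example in base R. This is only a sketch: the data frame d, the variable value and the dummy matrix are made up for illustration, and the matrix is built by hand (and dense) rather than returned by Hierarchies2ModelMatrix.

```r
# Hypothetical input: four data rows with a region code and one numerical variable
d <- data.frame(geo = c("Spain", "Portugal", "Iceland", "Spain"),
                value = c(10, 20, 30, 40))

# Hand-built dummy matrix: one row per data row, one column per output code
x <- cbind(Total = 1,
           EU    = as.numeric(d$geo %in% c("Spain", "Portugal")),
           nonEU = as.numeric(d$geo == "Iceland"))

# y has one column for each numerical variable
y <- as.matrix(d[, "value", drop = FALSE])

# Aggregates: one row per column of x
agg <- t(x) %*% y
agg
```

Each row of agg is the sum of value over the data rows flagged in the corresponding column of x.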
Usage
Hierarchies2ModelMatrix(
  data,
  hierarchies,
  inputInOutput = TRUE,
  crossTable = FALSE,
  total = "Total",
  hierarchyVarNames = c(mapsFrom = "mapsFrom", mapsTo = "mapsTo", sign = "sign", level = "level"),
  unionComplement = FALSE,
  reOrder = TRUE,
  select = NULL,
  removeEmpty = FALSE,
  selectionByMultiplicationLimit = 10^7,
  makeColnames = TRUE,
  verbose = FALSE,
  ...
)
Arguments
data |
Matrix or data frame with data containing codes of relevant variables |
hierarchies |
List of hierarchies, which can be converted by |
inputInOutput |
Logical vector (possibly recycled) for each element of hierarchies.
TRUE means that codes from input are included in output. Values corresponding to |
crossTable |
Cross table in output when TRUE |
total |
See |
hierarchyVarNames |
Variable names in the hierarchy tables as in |
unionComplement |
Logical vector (possibly recycled) for each element of hierarchies.
When TRUE, sign means union and complement instead of addition or subtraction.
Values corresponding to |
reOrder |
When TRUE (default) output codes are ordered in a way similar to a usual model matrix ordering. |
select |
Data frame specifying variable combinations for output or a named list specifying code selections for each variable (see details). |
removeEmpty |
When TRUE and when |
selectionByMultiplicationLimit |
With non-NULL |
makeColnames |
Colnames included when TRUE (default). |
verbose |
Whether to print information during calculations. FALSE is default. |
... |
Extra unused parameters |
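The difference between addition and union described for unionComplement can be sketched conceptually in base R. The indicator vectors below are hypothetical, and the code shows only the set interpretation of the signs; it is not the package's implementation.

```r
# Hypothetical membership indicators for four data rows
eu    <- c(1, 1, 0, 1)  # rows that belong to EU
spain <- c(1, 0, 0, 1)  # rows that belong to Spain (also part of EU)

# sign = 1 for both codes, read as addition: Spanish rows get weight 2
added <- eu + spain

# sign = 1 for both codes, read as union: a row is either included or not
united <- pmin(eu + spain, 1)

# sign = -1 read as subtraction versus complement (set difference)
subtracted   <- spain - eu            # can become negative
complemented <- pmax(spain - eu, 0)   # never below zero
```

With addition, a code built from overlapping codes counts the overlap twice; with union, each data row contributes at most once.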
Details
This function makes use of AutoHierarchies and HierarchyCompute via HierarchyComputeDummy.
Since the dummy matrix is transposed in comparison to HierarchyCompute, the parameter rowSelect is renamed to select and makeRownames is renamed to makeColnames.
When select is given as a list, it can be partially specified: not all hierarchy names have to be included.
The parameter inputInOutput will then apply only to hierarchies that are not in the select list (see note).
Value
A sparse model matrix or a list of two elements (model matrix and cross table)
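When the list form is returned, the two elements can be combined into a table of aggregates. The sketch below uses a hand-built stand-in for the returned list, with the element names modelMatrix and crossTable taken from the examples on this page. The dense matrix, the codes and the variable value are assumptions made for illustration.

```r
# Hand-built stand-in for the list output (dense matrix instead of sparse)
mm <- matrix(c(1, 1, 1,   # column 1: age Total
               1, 0, 1,   # column 2: age young
               0, 1, 0),  # column 3: age old
             nrow = 3)
out <- list(modelMatrix = mm,
            crossTable = data.frame(age = c("Total", "young", "old"),
                                    geo = "Total"))

# One numerical variable for the three data rows
y <- matrix(c(5, 7, 11), ncol = 1, dimnames = list(NULL, "value"))

# One aggregate per column of the model matrix, i.e. per row of the cross table
aggregates <- crossprod(out$modelMatrix, y)

# Label the aggregates with the cross table
result <- cbind(out$crossTable, as.matrix(aggregates))
result
```

Row i of the cross table describes column i of the model matrix, so the two can be bound together directly.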
Note
When select is a list, it is handled via a special coding of the inputInOutput parameter.
This parameter is converted into a list (as.list) and the select elements are inserted into this list.
This coding is also available as an additional option for users of the function.
Author(s)
Øyvind Langsrud
See Also
ModelMatrix, HierarchiesAndFormula2ModelMatrix
Examples
# Create some input
z <- SSBtoolsData("sprt_emp_withEU")
ageHier <- SSBtoolsData("sprt_emp_ageHier")
geoDimList <- FindDimLists(z[, c("geo", "eu")], total = "Europe")[[1]]
# First example has list output
Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList), inputInOutput = FALSE,
                        crossTable = TRUE)
m1 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList), inputInOutput = FALSE)
m2 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList))
m3 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList, year = ""),
                              inputInOutput = FALSE)
m4 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList, year = "allYears"),
                              inputInOutput = c(FALSE, FALSE, TRUE))
# Illustrate the effect of unionComplement, geoHier2 as in the examples of HierarchyCompute
geoHier2 <- rbind(data.frame(mapsFrom = c("EU", "Spain"), mapsTo = "EUandSpain", sign = 1),
                  SSBtoolsData("sprt_emp_geoHier")[, -4])
m5 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoHier2, year = "allYears"),
                              inputInOutput = FALSE) # Spain is counted twice
m6 <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoHier2, year = "allYears"),
                              inputInOutput = FALSE, unionComplement = TRUE)
# Compute aggregates
ths_per <- as.matrix(z[, "ths_per", drop = FALSE]) # matrix with the values to be aggregated
t(m1) %*% ths_per # crossprod(m1, ths_per) is equivalent and faster
t(m2) %*% ths_per
t(m3) %*% ths_per
t(m4) %*% ths_per
t(m5) %*% ths_per
t(m6) %*% ths_per
# Example using the select parameter as a data frame
select <- data.frame(age = c("Y15-64", "Y15-29", "Y30-64"), geo = c("EU", "nonEU", "Spain"))
m2a <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList), select = select)
# Same result by slower alternative
m2B <- Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList), crossTable = TRUE)
m2b <- m2B$modelMatrix[, Match(select, m2B$crossTable), drop = FALSE]
t(m2b) %*% ths_per
# Examples using the select parameter as a list
Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList),
                        inputInOutput = FALSE,
                        select = list(geo = c("nonEU", "Portugal")))
Hierarchies2ModelMatrix(z, list(age = ageHier, geo = geoDimList),
                        select = list(geo = c("nonEU", "Portugal"), age = c("Y15-64", "Y15-29")))
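The "slower alternative" in the data frame example above picks columns of a full model matrix by matching the rows of select against the cross table. A base R sketch of that idea follows. The data, the helper row_key and the use of base match on pasted keys are assumptions for illustration; Match in the example is assumed to perform a comparable row-wise matching.

```r
# Hypothetical cross table: one row per column of the model matrix
crossTable <- data.frame(age = c("Total", "Total", "young", "young"),
                         geo = c("Total", "EU", "Total", "EU"))

# Hypothetical dense model matrix: three data rows, four output columns
x <- matrix(c(1, 1, 1,   # Total / Total
              1, 0, 1,   # Total / EU
              1, 1, 0,   # young / Total
              1, 0, 0),  # young / EU
            nrow = 3)

# Requested variable combinations, in the order wanted in the output
select <- data.frame(age = c("young", "Total"),
                     geo = c("EU", "Total"))

# Hypothetical helper: build one key per row so rows can be matched with match()
row_key <- function(df) paste(df$age, df$geo, sep = "|")

# Column positions of the requested combinations
idx <- match(row_key(select), row_key(crossTable))

# Keep only the selected columns, in the requested order
x_sel <- x[, idx, drop = FALSE]
x_sel
```

Building the full matrix first and subsetting afterwards is why this route is slower than passing select directly.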