GaussSuppressionTwoWay {GaussSuppression} | R Documentation |
Two-way iteration variant of GaussSuppressionFromData
Description
Internally, data is organized in a two-way table. Use parameter colVar to choose hierarchies for columns (others will be rows). Iterations start with column-by-column suppression. The algorithm utilizes HierarchyCompute2.
With two-way iterations, larger data can be handled, but there is a residual risk. The method is a special form of linked-table iteration. Separately, the rows and columns are protected by GaussSuppression and they have common suppressed cells.
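To make the row/column split concrete, here is a minimal base R sketch of a two-way layout. It is only a conceptual illustration under stated assumptions: the toy data frame, the names rowKey and colKey, and the use of xtabs are all hypothetical and are not how the package builds its internal tables.

```r
# Toy data (hypothetical); variable names borrowed from the examples
d <- data.frame(
  region   = c("A", "A", "B", "B"),
  mnd      = c("m1", "m2", "m1", "m2"),
  hovedint = c("x", "y", "x", "x"),
  ant      = c(3, 1, 4, 2)
)

colVar <- "hovedint"                                       # variable chosen for columns
rowVar <- setdiff(c("region", "mnd", "hovedint"), colVar)  # the others become rows

# One key per row combination and one per column category
d$rowKey <- interaction(d[rowVar], drop = TRUE, sep = ":")
d$colKey <- factor(d[[colVar]])

# Two-way table of counts: rows = region:mnd combinations, columns = hovedint
tab <- xtabs(ant ~ rowKey + colKey, data = d)
print(tab)
```

Choosing a different colVar in this sketch only changes which variables end up in rowKey and which in colKey; the cell counts are the same.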
Usage
GaussSuppressionTwoWay(
data,
dimVar = NULL,
freqVar = NULL,
numVar = NULL,
weightVar = NULL,
charVar = NULL,
hierarchies,
formula = NULL,
maxN = suppressWarnings(formals(c(primary)[[1]])$maxN),
protectZeros = suppressWarnings(formals(c(primary)[[1]])$protectZeros),
secondaryZeros = suppressWarnings(formals(candidates)$secondaryZeros),
candidates = CandidatesDefault,
primary = PrimaryDefault,
forced = NULL,
hidden = NULL,
singleton = SingletonDefault,
singletonMethod = ifelse(secondaryZeros, "anySumNOTprimary", "anySum"),
printInc = TRUE,
output = "publish",
preAggregate = is.null(freqVar),
colVar = names(hierarchies)[1],
removeEmpty = TRUE,
inputInOutput = TRUE,
candidatesFromTotal = TRUE,
structuralEmpty = FALSE,
freqVarNew = rev(make.unique(c(names(data), "freq")))[1],
...
)
Arguments
data |
Input data as a data frame |
dimVar |
The main dimensional variables and additional aggregating variables. This parameter can be useful when hierarchies and formula are unspecified. |
freqVar |
A single variable holding counts (name or number). |
numVar |
Other numerical variables to be aggregated |
weightVar |
Weights (costs) to be used to order candidates for secondary suppression |
charVar |
Other variables possibly to be used within the supplied functions |
hierarchies |
List of hierarchies, which can be converted by |
formula |
A model formula |
maxN |
Suppression parameter. See |
protectZeros |
Suppression parameter. See |
secondaryZeros |
Suppression parameter. See |
candidates |
GaussSuppression input or a function generating it (see details) Default: |
primary |
GaussSuppression input or a function generating it (see details) Default: |
forced |
GaussSuppression input or a function generating it (see details) |
hidden |
GaussSuppression input or a function generating it (see details) |
singleton |
NULL or a function generating GaussSuppression input (logical vector not possible) Default: |
singletonMethod |
|
printInc |
|
output |
One of |
preAggregate |
When |
colVar |
Hierarchy variables for the column groups (others in row group). |
removeEmpty |
When TRUE (default) empty output corresponding to empty input is removed. When NULL, removal only within the algorithm (x matrices) so that such empty outputs are never secondary suppressed. |
inputInOutput |
Logical vector (possibly recycled) for each element of hierarchies.
TRUE means that codes from input are included in output. Values corresponding to |
candidatesFromTotal |
When TRUE (default), same candidates for all rows and for all columns, computed from row/column totals. |
structuralEmpty |
See |
freqVarNew |
Name of new frequency variable generated when input |
... |
Further arguments to be passed to the supplied functions. |
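The default for freqVarNew shown under Usage is a plain base R expression, so it can be evaluated on its own. The sketch below wraps it in a helper (the name defaultFreqVarNew is hypothetical, as are the column names) to show that it yields "freq" unless that name is already taken.

```r
# Hypothetical helper wrapping the default expression from Usage:
#   rev(make.unique(c(names(data), "freq")))[1]
defaultFreqVarNew <- function(colNames) {
  rev(make.unique(c(colNames, "freq")))[1]
}

# No column called "freq" yet: the new variable is simply named "freq"
nameA <- defaultFreqVarNew(c("region", "hovedint", "ant"))

# "freq" already exists: make.unique() appends a suffix to avoid a clash
nameB <- defaultFreqVarNew(c("region", "freq"))

print(c(nameA, nameB))
```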
Details
The supplied functions for generating GaussSuppression input behave as in GaussSuppressionFromData with some exceptions. When candidatesFromTotal is TRUE (default), the candidate function will be run locally once for rows and once for columns, each time based on column or row totals. The global x-matrix will only be generated if one of the supplied functions needs it. Non-NULL singleton can only be supplied as a function. This function will be run locally within the algorithm before each call to GaussSuppression.
Note that a difference from GaussSuppressionFromData is that parameter removeEmpty is set to TRUE by default.
Another difference is that duplicated combinations are not allowed. Normally duplicates are avoided by setting preAggregate to TRUE. When the charVar parameter is used, this can still be a problem. See the examples for a possible workaround.
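Whether the input contains duplicated combinations can be checked before calling the function. The following base R sketch uses a hypothetical toy data frame and plain aggregate(); it illustrates what pre-aggregation achieves conceptually and is not the package's internal code.

```r
# Toy data (hypothetical): "mnd" is not among the hierarchy variables,
# so the combination (region = "A", hovedint = "x") occurs twice
d <- data.frame(
  region   = c("A", "A", "B", "B", "A"),
  hovedint = c("x", "y", "x", "y", "x"),
  mnd      = c("m1", "m1", "m1", "m1", "m2"),
  ant      = c(3L, 1L, 4L, 2L, 5L)
)

dimCols <- c("region", "hovedint")  # variables covered by the hierarchies

# TRUE when at least one combination of the dimensional variables repeats
hasDup <- anyDuplicated(d[, dimCols]) > 0

# Summing counts over the dimensional variables removes the duplicates
agg <- aggregate(ant ~ region + hovedint, data = d, FUN = sum)

print(hasDup)
print(agg)
```

After aggregation each combination appears once and the total count is unchanged, which is the situation the two-way algorithm requires.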
Value
Aggregated data with suppression information
Examples
z3 <- SSBtoolsData("z3")

dimListsA <- SSBtools::FindDimLists(z3[, 1:6])
dimListsB <- SSBtools::FindDimLists(z3[, c(1, 4, 5)])

set.seed(123)
z <- z3[sample(nrow(z3), 250), ]

## Not run: 
out1 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA,
                               colVar = c("hovedint"))
## End(Not run)

out2 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA,
                               colVar = c("hovedint", "mnd"))
out3 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsB,
                               colVar = c("region"))
out4 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsB,
                               colVar = c("hovedint", "region"))

# "mnd" not in hierarchies -> duplicated combinations in input
# Error when preAggregate is FALSE: Index method failed. Duplicated combinations?
out5 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA[1:2],
                               protectZeros = FALSE, colVar = c("hovedint"),
                               preAggregate = TRUE)

# charVar needed -> Still problem when preAggregate is TRUE
# Possible workaround by extra hierarchy
out6 <- GaussSuppressionTwoWay(z, freqVar = "ant", charVar = "mnd2",
                               hierarchies = c(dimListsA[1:2], mnd2 = "Total"), # include charVar
                               inputInOutput = c(TRUE, TRUE, FALSE),            # FALSE -> only Total
                               protectZeros = FALSE, colVar = c("hovedint"),
                               preAggregate = TRUE,
                               hidden = function(x, data, charVar, ...)
                                 as.vector((t(x) %*% as.numeric(data[[charVar]] == "M06M12")) == 0))