GaussSuppressionTwoWay {GaussSuppression}R Documentation

Two-way iteration variant of GaussSuppressionFromData

Description

Internally, data is organized in a two-way table.

Use parameter colVar to choose hierarchies for columns (others will be rows). Iterations start by column by column suppression. The algorithm utilizes HierarchyCompute2.

With two-way iterations, larger data can be handled, but there is a residual risk. The method is a special form of linked-table iteration. Separately, the rows and columns are protected by GaussSuppression and they have common suppressed cells.

Usage

GaussSuppressionTwoWay(
  data,
  dimVar = NULL,
  freqVar = NULL,
  numVar = NULL,
  weightVar = NULL,
  charVar = NULL,
  hierarchies,
  formula = NULL,
  maxN = suppressWarnings(formals(c(primary)[[1]])$maxN),
  protectZeros = suppressWarnings(formals(c(primary)[[1]])$protectZeros),
  secondaryZeros = suppressWarnings(formals(candidates)$secondaryZeros),
  candidates = CandidatesDefault,
  primary = PrimaryDefault,
  forced = NULL,
  hidden = NULL,
  singleton = SingletonDefault,
  singletonMethod = ifelse(secondaryZeros, "anySumNOTprimary", "anySum"),
  printInc = TRUE,
  output = "publish",
  preAggregate = is.null(freqVar),
  colVar = names(hierarchies)[1],
  removeEmpty = TRUE,
  inputInOutput = TRUE,
  candidatesFromTotal = TRUE,
  structuralEmpty = FALSE,
  freqVarNew = rev(make.unique(c(names(data), "freq")))[1],
  ...
)

Arguments

data

Input data as a data frame

dimVar

The main dimensional variables and additional aggregating variables. This parameter can be useful when hierarchies and formula are unspecified.

freqVar

A single variable holding counts (name or number).

numVar

Other numerical variables to be aggregated

weightVar

weightVar Weights (costs) to be used to order candidates for secondary suppression

charVar

Other variables possibly to be used within the supplied functions

hierarchies

List of hierarchies, which can be converted by AutoHierarchies. Thus, the variables can also be coded by "rowFactor" or "", which correspond to using the categories in the data.

formula

A model formula

maxN

Suppression parameter. See GaussSuppressionFromData.

protectZeros

Suppression parameter. See GaussSuppressionFromData.

secondaryZeros

Suppression parameter. See GaussSuppressionFromData.

candidates

GaussSuppression input or a function generating it (see details) Default: CandidatesDefault

primary

GaussSuppression input or a function generating it (see details) Default: PrimaryDefault

forced

GaussSuppression input or a function generating it (see details)

hidden

GaussSuppression input or a function generating it (see details)

singleton

NULL or a function generating GaussSuppression input (logical vector not possible) Default: SingletonDefault

singletonMethod

GaussSuppression input

printInc

GaussSuppression input

output

One of "publish" (default), "inner". Here "inner" means input data (possibly pre-aggregated).

preAggregate

When TRUE, the data will be aggregated within the function to an appropriate level. This is defined by the dimensional variables according to dimVar, hierarchies or formula and in addition charVar.

colVar

Hierarchy variables for the column groups (others in row group).

removeEmpty

When TRUE (default) empty output corresponding to empty input is removed. When NULL, removal only within the algorithm (x matrices) so that such empty outputs are never secondary suppressed.

inputInOutput

Logical vector (possibly recycled) for each element of hierarchies. TRUE means that codes from input are included in output. Values corresponding to "rowFactor" or "" are ignored.

candidatesFromTotal

When TRUE (default), same candidates for all rows and for all columns, computed from row/column totals.

structuralEmpty

See GaussSuppressionFromData.

freqVarNew

Name of new frequency variable generated when input freqVar is NULL and preAggregate is TRUE. Default is "freq" provided this is not found in names(data).

...

Further arguments to be passed to the supplied functions.

Details

The supplied functions for generating GaussSuppression input behave as in GaussSuppressionFromData with some exceptions. When candidatesFromTotal is TRUE (default) the candidate function will be run locally once for rows and once for columns. Each time based on column or row totals. The global x-matrix will only be generated if one of the functions supplied needs it. Non-NULL singleton can only be supplied as a function. This function will be run locally within the algorithm before each call to GaussSuppression.

Note that a difference from GaussSuppressionFromData is that parameter removeEmpty is set to TRUE by default.

Another difference is that duplicated combinations is not allowed. Normally duplicates are avoided by setting preAggregate to TRUE. When the charVar parameter is used, this can still be a problem. See the examples for a possible workaround.

Value

Aggregated data with suppression information

Examples

z3 <- SSBtoolsData("z3")

dimListsA <- SSBtools::FindDimLists(z3[, 1:6])
dimListsB <- SSBtools::FindDimLists(z3[, c(1, 4, 5)])

set.seed(123)
z <- z3[sample(nrow(z3),250),]

## Not run: 
out1 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA, 
                               colVar = c("hovedint"))

## End(Not run)                                
out2 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA, 
                               colVar = c("hovedint", "mnd"))
out3 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsB, 
                               colVar = c("region"))
out4 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsB, 
                               colVar = c("hovedint", "region"))
                               
# "mnd" not in  hierarchies -> duplicated combinations in input 
# Error when  preAggregate is FALSE: Index method failed. Duplicated combinations?
out5 <- GaussSuppressionTwoWay(z, freqVar = "ant", hierarchies = dimListsA[1:3], 
                               protectZeros = FALSE, colVar = c("hovedint"), preAggregate = TRUE)


# charVar needed -> Still problem when preAggregate is TRUE
# Possible workaround by extra hierarchy 
out6 <- GaussSuppressionTwoWay(z, freqVar = "ant", charVar = "mnd2",
                               hierarchies = c(dimListsA[1:3], mnd2 = "Total"), # include charVar 
                               inputInOutput = c(TRUE, TRUE, FALSE),  # FALSE -> only Total 
                               protectZeros = FALSE, colVar = c("hovedint"),
                               preAggregate = TRUE,  
                               hidden = function(x, data, charVar, ...) 
                                 as.vector((t(x) %*% as.numeric(data[[charVar]] == "M06M12")) == 0))                                

[Package GaussSuppression version 0.8.8 Index]