GaussSuppressionFromData {GaussSuppression}  R Documentation 
Cell suppression from input data containing inner cells
Description
Aggregates are generated, followed by primary suppression, followed by
secondary suppression by Gaussian elimination using GaussSuppression.
Usage
GaussSuppressionFromData(
data,
dimVar = NULL,
freqVar = NULL,
...,
numVar = NULL,
weightVar = NULL,
charVar = NULL,
hierarchies = NULL,
formula = NULL,
maxN = suppressWarnings(formals(c(primary)[[1]])$maxN),
protectZeros = suppressWarnings(formals(c(primary)[[1]])$protectZeros),
secondaryZeros = suppressWarnings(formals(candidates)$secondaryZeros),
candidates = CandidatesDefault,
primary = PrimaryDefault,
forced = NULL,
hidden = NULL,
singleton = SingletonDefault,
singletonMethod = ifelse(secondaryZeros, "anySumNOTprimary", "anySum"),
printInc = TRUE,
output = "publish",
x = NULL,
crossTable = NULL,
preAggregate = is.null(freqVar),
extraAggregate = preAggregate & !is.null(charVar),
structuralEmpty = FALSE,
extend0 = FALSE,
spec = NULL,
specLock = FALSE,
freqVarNew = rev(make.unique(c(names(data), "freq")))[1],
nUniqueVar = rev(make.unique(c(names(data), "nUnique")))[1],
forcedInOutput = "ifNonNULL",
unsafeInOutput = "ifForcedInOutput",
lpPackage = NULL
)
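The defaults for freqVarNew and nUniqueVar in the usage above are built with make.unique, so that a generated column does not collide with an existing column name. A minimal base R sketch of that expression follows; the helper name new_name is hypothetical and used only for illustration.

```r
# Hypothetical helper mirroring the default expression used for freqVarNew
new_name <- function(existing, wanted) {
  # make.unique() appends a numeric suffix to repeated names;
  # after rev(), the first element is the (possibly suffixed) wanted name
  rev(make.unique(c(existing, wanted)))[1]
}

no_clash <- new_name(c("var1", "var2", "values"), "freq")
clash    <- new_name(c("var1", "freq"), "freq")
print(no_clash)
print(clash)
```

When the input data already has a column called freq, the suffixed name is used instead, so the existing column is left untouched.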
Arguments
data 
Input data as a data frame 
dimVar 
The main dimensional variables and additional aggregating variables. This parameter can be useful when hierarchies and formula are unspecified. 
freqVar 
A single variable holding counts (name or number). 
... 
Further arguments to be passed to the supplied functions and to 
numVar 
Other numerical variables to be aggregated 
weightVar 
Weights (costs) to be used to order candidates for secondary suppression 
charVar 
Other variables possibly to be used within the supplied functions 
hierarchies 
List of hierarchies, which can be converted by 
formula 
A model formula 
maxN 
Suppression parameter. Cells with frequency 
protectZeros 
Suppression parameter.
When 
secondaryZeros 
Suppression parameter.
When 
candidates 
GaussSuppression input or a function generating it (see details). Default: CandidatesDefault
primary 
GaussSuppression input or a function generating it (see details). Default: PrimaryDefault
forced 
GaussSuppression input or a function generating it (see details) 
hidden 
GaussSuppression input or a function generating it (see details) 
singleton 
GaussSuppression input or a function generating it (see details). Default: SingletonDefault
singletonMethod 

printInc 

output 
One of 
x 

crossTable 
See above. 
preAggregate 
When 
extraAggregate 
When 
structuralEmpty 
When 
extend0 
Data is automatically extended by 
spec 

specLock 
When 
freqVarNew 
Name of new frequency variable generated when input 
nUniqueVar 
Name of variable holding the number of unique contributors.
This variable will be generated in the 
forcedInOutput 
Whether to include 
unsafeInOutput 
Whether to include 
lpPackage 

Details
The supplied functions for generating GaussSuppression input take the following arguments:
crossTable, x, freq, num, weight, maxN, protectZeros, secondaryZeros, data, freqVar, numVar, weightVar, charVar, dimVar and ...,
where the first two are ModelMatrix outputs (modelMatrix renamed to x).
The vector freq holds the aggregated counts (t(x) %*% data[[freqVar]]).
Similarly, num is a data frame of aggregated numerical variables.
In addition, the supplied singleton function also takes nUniqueVar and (output from) primary as input.
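The aggregation of freq described above can be sketched in base R. This is an illustration only: the dense dummy matrix x below is built by hand as a stand-in for ModelMatrix output, and the data frame inner is made up for the example.

```r
# Inner cells: one row per contributing record group
inner <- data.frame(region = c("A", "A", "B"), freq = c(2, 3, 4))

# Hand-built dummy matrix (assumption: stands in for ModelMatrix output).
# Rows are inner cells, columns are the cells to be published.
x <- cbind(Total = 1,
           A = as.numeric(inner$region == "A"),
           B = as.numeric(inner$region == "B"))

# Aggregated counts, computed as t(x) %*% data[[freqVar]]
freq <- as.vector(t(x) %*% inner[["freq"]])
names(freq) <- colnames(x)
print(freq)
```

Each column of x picks out the inner cells that contribute to one published cell, so the matrix product sums the inner counts per published cell.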
It is possible to supply several primary functions joined by c, e.g. c(FunPrim1, FunPrim2).
All NAs returned from any of the functions force the corresponding cells not to be primary suppressed.
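The NA rule for several primary functions can be illustrated with plain logical vectors. The function combine_primary below is a hypothetical sketch of the stated rule (a cell is primary suppressed if some function returns TRUE and no function returns NA); it is not the package's internal code, and it assumes every function returns a logical vector.

```r
freq <- c(2, 50, 8, 60)
maxN <- 10

# Outputs of three primary functions, here written as logical vectors
results <- list(
  freq >= 45,                         # large cells
  freq <= maxN,                       # small cells
  NA & c(TRUE, FALSE, FALSE, TRUE)    # NA for cells 1 and 4, FALSE otherwise
)

# Hypothetical combination rule: any TRUE suppresses, any NA overrides
combine_primary <- function(results) {
  m <- do.call(cbind, results)
  any_true <- rowSums(m, na.rm = TRUE) > 0
  any_na   <- rowSums(is.na(m)) > 0
  any_true & !any_na
}

primary <- combine_primary(results)
print(primary)
```

Cells 1 and 4 would be suppressed by the first two rules, but the NA returned by the third vector keeps them unsuppressed.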
The effect of maxN, protectZeros and secondaryZeros depends on the supplied functions where these parameters are used.
Their default values are inherited from the default values of the first primary function (several possible) or,
in the case of secondaryZeros, the candidates function.
When defaults cannot be inherited, they are set to NULL.
In practice, the function formals is still used to generate the defaults when primary and/or candidates are not functions.
Then NULL is correctly returned, but suppressWarnings is needed.
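The inheritance of defaults described above can be tried out in base R. The sketch reuses the expression from the Usage section; the functions prim_a and prim_b and the wrapper inherit_maxN are hypothetical names introduced for illustration.

```r
# Two hypothetical primary functions with different maxN defaults
prim_a <- function(freq, maxN = 3, protectZeros = TRUE, ...) freq <= maxN
prim_b <- function(freq, maxN = 10, ...) freq <= maxN

# Same expression as the maxN default in the Usage section
inherit_maxN <- function(primary) {
  suppressWarnings(formals(c(primary)[[1]])$maxN)
}

from_single   <- inherit_maxN(prim_a)             # default of the only function
from_first    <- inherit_maxN(c(prim_b, prim_a))  # default of the first function
from_nonfun   <- inherit_maxN(c(TRUE, FALSE))     # not a function: NULL
print(from_single)
print(from_first)
print(is.null(from_nonfun))
```

Without suppressWarnings, the last call would emit a warning that the argument is not a function before returning NULL.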
Singleton handling can be turned off by singleton = NULL or singletonMethod = "none".
These two choices are equivalent in the sense that singletonMethod is set to "none" whenever singleton is NULL and vice versa.
Information about uncertain primary suppressions due to forced cells can be found
as described by parameters unsafeInOutput and output (= "all").
This is not implemented for cases where forced cells affect singleton problems.
Some information can be seen from warnings.
This can also be seen by choosing output = "secondary" together
with unsafeInOutput = "ifany" or unsafeInOutput = "always".
Then, negative indices from GaussSuppression using unsafeAsNegative = TRUE will be included in the output.
Singleton problems may, however, be present even if they cannot be seen as warnings or in the output.
In some cases, the problems can be detected by GaussSuppressDec.
In some cases, cells that are forced, hidden, or primary suppressed can overlap.
For these situations, forced has precedence over hidden and primary.
That is, if a cell is both forced and hidden, it will be treated as a forced cell and thus published.
Similarly, any primary suppression of a forced cell will be ignored
(see parameter whenPrimaryForced to GaussSuppression).
It is, however, meaningful to combine primary and hidden.
Such cells will be protected while also being assigned the NA value in the suppressed output variable.
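The precedence rules above can be summarised with logical vectors. This is a hypothetical base R sketch of the stated rules, not package code: secondary suppression is left out, and it assumes that hidden cells in general receive NA in the suppressed variable.

```r
# One element per cell; the vectors are made up for illustration
forced  <- c(TRUE,  TRUE,  FALSE, FALSE, FALSE)
hidden  <- c(TRUE,  FALSE, TRUE,  TRUE,  FALSE)
primary <- c(FALSE, TRUE,  TRUE,  FALSE, FALSE)

# forced has precedence over both hidden and primary
effective_hidden  <- hidden & !forced
effective_primary <- primary & !forced

# Cells that still need protection (cell 3 is both primary and hidden)
protected <- effective_primary

# Assumed coding of the suppressed variable before secondary suppression:
# NA for hidden cells, TRUE for primary cells, FALSE otherwise
suppressed <- ifelse(effective_hidden, NA, effective_primary)
print(suppressed)
```

Cells 1 and 2 are published because they are forced, while cell 3 is both protected and reported as NA.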
Value
Aggregated data with suppression information
Author(s)
Øyvind Langsrud and Daniel Lupp
Examples
z1 <- SSBtoolsData("z1")
GaussSuppressionFromData(z1, 1:2, 3)
z2 <- SSBtoolsData("z2")
GaussSuppressionFromData(z2, 1:4, 5, protectZeros = FALSE)
# Data as in GaussSuppression examples
df <- data.frame(values = c(1, 1, 1, 5, 5, 9, 9, 9, 9, 9, 0, 0, 0, 7, 7),
                 var1 = rep(1:3, each = 5), var2 = c("A", "B", "C", "D", "E"))
GaussSuppressionFromData(df, c("var1", "var2"), "values")
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 + var2, maxN = 10)
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 + var2, maxN = 10,
protectZeros = TRUE, # Parameter needed by SingletonDefault and default not in primary
primary = function(freq, crossTable, maxN, ...)
which(freq <= maxN & crossTable[[2]] != "A" & crossTable[, 2] != "C"))
# Combining several primary functions
# Note that NA & c(TRUE, FALSE) equals c(NA, FALSE)
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 + var2, maxN = 10,
primary = c(function(freq, maxN, protectZeros = TRUE, ...) freq >= 45,
function(freq, maxN, ...) freq <= maxN,
function(crossTable, ...) NA & crossTable[[2]] == "C",
function(crossTable, ...) NA & crossTable[[1]]== "Total"
& crossTable[[2]]== "Total"))
# Similar to GaussSuppression examples
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 * var2,
candidates = NULL, singleton = NULL, protectZeros = FALSE, secondaryZeros = TRUE)
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 * var2,
singleton = NULL, protectZeros = FALSE, secondaryZeros = FALSE)
GaussSuppressionFromData(df, c("var1", "var2"), "values", formula = ~var1 * var2,
protectZeros = FALSE, secondaryZeros = FALSE)
# Examples with zeros as singletons
z <- data.frame(row = rep(1:3, each = 3), col = 1:3, freq = c(0, 2, 5, 0, 0, 6:9))
GaussSuppressionFromData(z, 1:2, 3, singleton = NULL)
GaussSuppressionFromData(z, 1:2, 3, singletonMethod = "none") # as above
GaussSuppressionFromData(z, 1:2, 3)
GaussSuppressionFromData(z, 1:2, 3, protectZeros = FALSE, secondaryZeros = TRUE, singleton = NULL)
GaussSuppressionFromData(z, 1:2, 3, protectZeros = FALSE, secondaryZeros = TRUE)