histogramDP {DPpack} | R Documentation |
Differentially Private Histogram
Description
This function computes a differentially private histogram from a vector at user-specified privacy levels of epsilon and delta. A histogram object is returned with sanitized values for the counts for easy plotting.
Usage
histogramDP(
x,
eps,
breaks = "Sturges",
normalize = FALSE,
which.sensitivity = "bounded",
mechanism = "Laplace",
delta = 0,
type.DP = "aDP",
allow.negative = FALSE
)
Arguments
x |
Numeric vector from which the histogram will be formed. |
eps |
Positive real number defining the epsilon privacy budget. |
breaks |
Identical to the argument with the same name from
|
normalize |
Logical value. If FALSE (default), returned histogram counts correspond to frequencies. If TRUE, returned histogram counts correspond to densities (i.e. area of histogram is one). |
which.sensitivity |
String indicating which type of sensitivity to use. Can be one of 'bounded', 'unbounded', 'both'. If 'bounded' (default), returns result based on bounded definition for differential privacy. If 'unbounded', returns result based on unbounded definition. If 'both', returns result based on both methods (Kifer and Machanavajjhala 2011). Note that if 'both' is chosen, each result individually satisfies (eps, delta)-differential privacy, but may not do so collectively and in composition. Care must be taken not to violate differential privacy in this case. |
mechanism |
String indicating which mechanism to use for differential
privacy. Currently the following mechanisms are supported: 'Laplace',
'Gaussian'. Default is Laplace. See |
delta |
Nonnegative real number defining the delta privacy parameter. If 0 (default), reduces to eps-DP and the Laplace mechanism is used. |
type.DP |
String indicating the type of differential privacy desired for the Gaussian mechanism (if selected). Can be either 'pDP' for probabilistic DP (Machanavajjhala et al. 2008) or 'aDP' for approximate DP (Dwork et al. 2006). Note that if 'aDP' is chosen, epsilon must be strictly less than 1. |
allow.negative |
Logical value. If FALSE (default), any negative values in the sanitized histogram due to the added noise will be set to 0. If TRUE, the negative values (if any) will be returned. |
Value
Sanitized histogram based on the bounded and/or unbounded definitions of differential privacy.
References
Dwork C, McSherry F, Nissim K, Smith A (2006). “Calibrating Noise to Sensitivity in Private Data Analysis.” In Halevi S, Rabin T (eds.), Theory of Cryptography, 265–284. ISBN 978-3-540-32732-5, https://doi.org/10.1007/11681878_14.
Kifer D, Machanavajjhala A (2011). “No Free Lunch in Data Privacy.” In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, SIGMOD '11, 193–204. ISBN 9781450306614, doi:10.1145/1989323.1989345.
Machanavajjhala A, Kifer D, Abowd J, Gehrke J, Vilhuber L (2008). “Privacy: Theory meets Practice on the Map.” In 2008 IEEE 24th International Conference on Data Engineering, 277-286. doi:10.1109/ICDE.2008.4497436.
Dwork C, Kenthapadi K, McSherry F, Mironov I, Naor M (2006). “Our Data, Ourselves: Privacy Via Distributed Noise Generation.” In Vaudenay S (ed.), Advances in Cryptology - EUROCRYPT 2006, 486–503. ISBN 978-3-540-34547-3, doi:10.1007/11761679_29.
Examples
x <- stats::rnorm(500)
graphics::hist(x) # Non-private histogram
result <- histogramDP(x, 1)
plot(result) # Private histogram
graphics::hist(x, freq=FALSE) # Normalized non-private histogram
result <- histogramDP(x, .5, normalize=TRUE, which.sensitivity='unbounded',
mechanism='Gaussian',delta=0.01, allow.negative=TRUE)
plot(result) # Normalized private histogram (note negative values allowed)