dapper_sample {dapper}    R Documentation

Private Posterior Sampler

Description

Generates samples from the private posterior using a data augmentation framework.

Usage

dapper_sample(
  data_model = NULL,
  sdp = NULL,
  init_par = NULL,
  seed = NULL,
  niter = 2000,
  warmup = floor(niter/2),
  chains = 1
)

Arguments

data_model

a data model represented by a privacy class object.

sdp

the observed privatized data. Must be a vector or matrix.

init_par

starting parameter value for the chain.

seed

random seed, set for reproducibility.

niter

total number of iterations per chain, including warmup.

warmup

number of iterations to discard as warmup. Default is half of niter.

chains

number of MCMC chains to run. Chains can be run in parallel or sequentially.

Details

Generates samples from the private posterior implied by data_model. The data_model input must be an object of class privacy, which is created using the new_privacy() constructor. MCMC chains can be run in parallel using furrr::future_map(); see the furrr package documentation for specifics. Long computations can be monitored with the progressr package.
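The data augmentation idea can be illustrated with a minimal base R sketch. It is not dapper's internal implementation. It assumes the same toy setting as the Examples section (unit-variance normal data, a N(0, 4) prior, and a privatized sample mean with Gaussian noise of standard deviation 1/eps). The helper names log_priv and draw_theta are hypothetical. The sampler alternates between updating each latent confidential record and drawing the parameter given the imputed data.

```r
# Illustrative data augmentation sampler in base R (not dapper's internals).
set.seed(1)
n   <- 100                                  # number of confidential records
eps <- 3                                    # privacy noise has sd 1/eps
y   <- rnorm(n, mean = -2, sd = 1)          # confidential data (never released)
sdp <- mean(y) + rnorm(1, 0, 1 / eps)       # released privatized mean

# Log density of the privacy mechanism: sdp | x ~ N(mean(x), (1/eps)^2).
log_priv <- function(sdp, xbar) {
  dnorm(sdp, mean = xbar, sd = 1 / eps, log = TRUE)
}

# Conjugate draw of theta | x under a N(0, 4) prior and unit data variance.
draw_theta <- function(x) {
  ps_s2 <- 1 / (1 / 4 + length(x))
  rnorm(1, mean = ps_s2 * sum(x), sd = sqrt(ps_s2))
}

niter  <- 1000
warmup <- floor(niter / 2)
theta  <- numeric(niter)
th     <- -2                                # starting value
x      <- rnorm(n, mean = th, sd = 1)       # initial latent database
sx     <- sum(x)

for (t in seq_len(niter)) {
  # Step 1: update each latent record with an independence proposal drawn
  # from the data model; the acceptance ratio then reduces to a ratio of
  # privacy mechanism densities.
  for (i in seq_len(n)) {
    prop   <- rnorm(1, mean = th, sd = 1)
    sx_new <- sx - x[i] + prop
    log_a  <- log_priv(sdp, sx_new / n) - log_priv(sdp, sx / n)
    if (log(runif(1)) < log_a) {
      x[i] <- prop
      sx   <- sx_new
    }
  }
  # Step 2: draw the parameter given the imputed confidential data.
  th <- draw_theta(x)
  theta[t] <- th
}

draws <- theta[(warmup + 1):niter]          # discard warmup iterations
c(mean = mean(draws), sd = sd(draws))
```

Because the proposal for each record is drawn from the data model itself, only the privacy mechanism density appears in the acceptance step. This is why, in this sketch, the sampler needs nothing beyond a way to simulate records, a way to evaluate the mechanism density, and a posterior sampler for the non-private model.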

Value

A dpout object which contains:

* chain: a draw_matrix object containing niter - warmup draws from the private posterior.
* accept_prob: a matrix with (niter - warmup) rows containing acceptance probabilities. Each column corresponds to a parameter.

References

Ju, N., Awan, J. A., Gong, R., & Rao, V. A. (2022). Data Augmentation MCMC for Bayesian Inference from Privatized Data. arXiv. doi:10.48550/ARXIV.2206.00710

See Also

new_privacy()

Examples

# simulate confidential data
# the privacy mechanism adds Gaussian noise to the sample mean
set.seed(1)
n <- 100
eps <- 3
y <- rnorm(n, mean = -2, sd = 1)
sdp <- mean(y) + rnorm(1, 0, 1/eps)

post_f <- function(dmat, theta) {
    x <- c(dmat)
    xbar <- mean(x)
    n <- length(x)
    pr_m <- 0
    pr_s2 <- 4
    ps_s2 <- 1/(1/pr_s2 + n)
    ps_m <- ps_s2 * ((1/pr_s2)*pr_m + n * xbar)
    rnorm(1, mean = ps_m, sd = sqrt(ps_s2))
}
latent_f <- function(theta) {
    # draw n latent records (n = 100, defined above)
    matrix(rnorm(n, mean = theta, sd = 1), ncol = 1)
}
st_f <- function(xi, sdp, i) {
    mean(xi)
}
priv_f <- function(sdp, sx) {
  sum(dnorm(sdp - sx, mean = 0, sd = 1/eps, log = TRUE))
}
dmod <- new_privacy(post_f = post_f,
  latent_f = latent_f,
  priv_f = priv_f,
  st_f = st_f,
  npar = 1)

out <- dapper_sample(dmod,
                    sdp = sdp,
                    init_par = -2,
                    niter = 500)
summary(out)

# for parallel computing we 'plan' a session
# the code below uses 2 CPU cores for parallel computing
library(furrr)
plan(multisession, workers = 2)
out <- dapper_sample(dmod,
                    sdp = sdp,
                    init_par = -2,
                    niter = 500,
                    chains = 2)

# to go back to sequential computing we use
plan(sequential)

[Package dapper version 1.0.0 Index]