R: Exact independent samples from an approximate posterior...

sample_marginal {aghq}

R Documentation

Exact independent samples from an approximate posterior distribution

Description

Draws samples from an approximate marginal distribution for general posteriors approximated using aghq, or from the mixture-of-Gaussians approximation to the variables that were marginalized over in a marginal Laplace approximation fit using aghq::marginal_laplace or aghq::marginal_laplace_tmb.

Usage

sample_marginal(
  quad,
  M,
  transformation = default_transformation(),
  interpolation = "auto",
  ...
)

## S3 method for class 'aghq'
sample_marginal(
  quad,
  M,
  transformation = quad$transformation,
  interpolation = "auto",
  ...
)

## S3 method for class 'marginallaplace'
sample_marginal(
  quad,
  M,
  transformation = quad$transformation,
  interpolation = "auto",
  ...
)

Arguments

`quad`	Object from which to draw samples. An object inheriting from class `marginallaplace` (the result of running `aghq::marginal_laplace` or `aghq::marginal_laplace_tmb`), or an object inheriting from class `aghq` (the result of running `aghq::aghq()`). Can also provide a `data.frame` returned by `aghq::compute_pdf_and_cdf` in which case samples are returned for `transparam` if `transformation` is provided, and for `param` if `transformation = NULL`.
`M`	Numeric, integer saying how many samples to draw
`transformation`	Optional. Draw samples for a transformation of the parameter whose posterior was normalized using adaptive quadrature. `transformation` is either: a) an `aghqtrans` object returned by `aghq::make_transformation`, or b) a list that will be passed to that function internally. See `?aghq::make_transformation` for details.
`interpolation`	Which method to use for interpolating the marginal posteriors (and hence to draw samples using the inverse CDF method), `'auto'` (choose for you), `'polynomial'` or `'spline'`? If `k > 3` then the polynomial may be unstable and you should use the spline, but the spline doesn't work unless `k > 3` so it's not the default. The default of `'auto'` figures this out for you. See `interpolate_marginal_posterior()`.
`...`	Used to pass additional arguments.

Details

For objects of class aghq or their marginal distribution components, sampling is done using the inverse CDF method, which is just compute_quantiles(quad$marginals[[1]],runif(M)).

For marginal Laplace approximations (aghq::marginal_laplace()): this method samples from the posterior and returns a vector that is ordered the same as the "W" variables in your marginal Laplace approximation. See Algorithm 1 in Stringer et. al. (2021, https://arxiv.org/abs/2103.07425) for the algorithm; the details of sampling from a Gaussian are described in the reference(s) therein, which makes use of the (sparse) Cholesky factors. These are computed once for each quadrature point and stored.

For the marginal Laplace approximations where the "inner" model is handled entirely by TMB (aghq::marginal_laplace_tmb), the interface here is identical to above, with the order of the "W" vector being determined by TMB. See the names of ff$env$last.par, for example (where ff is your template obtained from a call to TMB::MakeADFun.

If getOption('mc.cores',1L) > 1, the Cholesky decompositions of the Hessians are computed in parallel using parallel::mcapply, for the Gaussian approximation involved for objects of class marginallaplace. This step is slow so may be sped up by parallelization, if the matrices are sparse (and hence the operation is just slow, but not memory-intensive). Uses the parallel package so is not available on Windows.

Value

If run on a marginallaplace object, a list containing elements:

samps: d x M matrix where d = dim(W) and each column is a sample from pi(W|Y,theta)
theta: M x S tibble where S = dim(theta) containing the value of theta for each sample
thetasamples: A list of S numeric vectors each of length M where the jth element is a sample from pi(theta_{j}|Y). These are samples from the marginals, NOT the joint. Sampling from the joint is a much more difficult problem and how to do so in this context is an active area of research.

If run on an aghq object, then a list with just the thetasamples element. It still returns a list to maintain output consistency across inputs.

If, for some reason, you don't want to do the sampling from pi(theta|Y), you can manually set quad$marginals = NULL. Note that this sampling is typically very fast and so I don't know why you would need to not do it but the option is there if you like.

If, again for some reason, you just want samples from one marginal distribution using inverse CDF, you can just do compute_quantiles(quad$marginals[[1]],runif(M)).

Examples

logfteta2d <- function(eta,y) {
  # eta is now (eta1,eta2)
  # y is now (y1,y2)
  n <- length(y)
  n1 <- ceiling(n/2)
  n2 <- floor(n/2)
  y1 <- y[1:n1]
  y2 <- y[(n1+1):(n1+n2)]
  eta1 <- eta[1]
  eta2 <- eta[2]
  sum(y1) * eta1 - (length(y1) + 1) * exp(eta1) - sum(lgamma(y1+1)) + eta1 +
    sum(y2) * eta2 - (length(y2) + 1) * exp(eta2) - sum(lgamma(y2+1)) + eta2
}
set.seed(84343124)
n1 <- 5
n2 <- 5
n <- n1+n2
y1 <- rpois(n1,5)
y2 <- rpois(n2,5)
objfunc2d <- function(x) logfteta2d(x,c(y1,y2))
objfunc2dmarg <- function(W,theta) objfunc2d(c(W,theta))
objfunc2dmarggr <- function(W,theta) {
  fn <- function(W) objfunc2dmarg(W,theta)
  numDeriv::grad(fn,W)
}
objfunc2dmarghe <- function(W,theta) {
  fn <- function(W) objfunc2dmarg(W,theta)
  numDeriv::hessian(fn,W)
}

funlist2dmarg <- list(
  fn = objfunc2dmarg,
  gr = objfunc2dmarggr,
  he = objfunc2dmarghe
)

[Package aghq version 0.4.1 Index]