R: Draw fitted values from the posterior distribution

fitted_samples {gratia}

R Documentation

Draw fitted values from the posterior distribution

Description

Expectations (fitted values) of the response drawn from the posterior distribution of fitted model using a Gaussian approximation to the posterior or a simple Metropolis Hastings sampler.

Usage

fitted_samples(model, ...)

## S3 method for class 'gam'
fitted_samples(
  model,
  n = 1,
  data = newdata,
  seed = NULL,
  scale = c("response", "linear_predictor"),
  method = c("gaussian", "mh", "inla", "user"),
  n_cores = 1,
  burnin = 1000,
  thin = 1,
  t_df = 40,
  rw_scale = 0.25,
  freq = FALSE,
  unconditional = FALSE,
  draws = NULL,
  mvn_method = c("mvnfast", "mgcv"),
  ...,
  newdata = NULL,
  ncores = NULL
)

Arguments

`model`	a fitted model of the supported types
`...`	arguments passed to other methods. For `fitted_samples()`, these are passed on to `mgcv::predict.gam()`. For `posterior_samples()` these are passed on to `fitted_samples()`. For `predicted_samples()` these are passed on to the relevant `simulate()` method.
`n`	numeric; the number of posterior samples to return.
`data`	data frame; new observations at which the posterior draws from the model should be evaluated. If not supplied, the data used to fit the model will be used for `data`, if available in `model`.
`seed`	numeric; a random seed for the simulations.
`scale`	character; what scale should the fitted values be returned on? `"linear predictor"` is a synonym for `"link"` if you prefer that terminology.
`method`	character; which method should be used to draw samples from the posterior distribution. `"gaussian"` uses a Gaussian (Laplace) approximation to the posterior. `"mh"` uses a Metropolis Hastings sampler that alternates t proposals with proposals based on a shrunken version of the posterior covariance matrix. `"inla"` uses a variant of Integrated Nested Laplace Approximation due to Wood (2019), (currently not implemented). `"user"` allows for user-supplied posterior draws (currently not implemented).
`n_cores`	number of cores for generating random variables from a multivariate normal distribution. Passed to `mvnfast::rmvn()`. Parallelization will take place only if OpenMP is supported (but appears to work on Windows with current `R`).
`burnin`	numeric; number of samples to discard as the burnin draws. Only used with `method = "mh"`.
`thin`	numeric; the number of samples to skip when taking `n` draws. Results in `thin * n` draws from the posterior being taken. Only used with `method = "mh"`.
`t_df`	numeric; degrees of freedom for t distribution proposals. Only used with `method = "mh"`.
`rw_scale`	numeric; Factor by which to scale posterior covariance matrix when generating random walk proposals. Negative or non finite to skip the random walk step. Only used with `method = "mh"`.
`freq`	logical; `TRUE` to use the frequentist covariance matrix of the parameter estimators, `FALSE` to use the Bayesian posterior covariance matrix of the parameters.
`unconditional`	logical; if `TRUE` (and `freq == FALSE`) then the Bayesian smoothing parameter uncertainty corrected covariance matrix is used, if available.
`draws`	matrix; user supplied posterior draws to be used when `method = "user"`.
`mvn_method`	character; one of `"mvnfast"` or `"mgcv"`. The default is uses `mvnfast::rmvn()`, which can be considerably faster at generate large numbers of MVN random values than `mgcv::rmvn()`, but which might not work for some marginal fits, such as those where the covariance matrix is close to singular.
`newdata`	Deprecated: use `data` instead.
`ncores`	Deprecated; use `n_cores` instead. The number of cores for generating random variables from a multivariate normal distribution. Passed to `mvnfast::rmvn()`. Parallelization will take place only if OpenMP is supported (but appears to work on Windows with current `R`).

Value

A tibble (data frame) with 3 columns containing the posterior predicted values in long format. The columns are

row (integer) the row of data that each posterior draw relates to,
draw (integer) an index, in range 1:n, indicating which draw each row relates to,
response (numeric) the predicted response for the indicated row of data.

Note

Models with offset terms supplied via the offset argument to mgcv::gam() etc. are ignored by mgcv::predict.gam(). As such, this kind of offset term is also ignored by posterior_samples(). Offset terms that are included in the model formula supplied to mgcv::gam() etc are not ignored and the posterior samples produced will reflect those offset term values. This has the side effect of requiring any new data values provided to posterior_samples() via the data argument must include the offset variable.

Author(s)

Gavin L. Simpson

References

Wood, S.N., (2020). Simplified integrated nested Laplace approximation. Biometrika 107, 223–230. doi:10.1093/biomet/asz044

Examples

load_mgcv()

dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2)
m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

fs <- fitted_samples(m1, n = 5, seed = 42)

fs


# can generate own set of draws and use them
drws <- generate_draws(m1, n = 2, seed = 24)
fs2 <- fitted_samples(m1, method = "user", draws = drws)

fs2

[Package gratia version 0.9.2 Index]