smooth_samples {gratia}R Documentation

Posterior draws for individual smooths

Description

Returns draws from the posterior distributions of smooth functions in a GAM. Useful, for example, for visualising the uncertainty in individual estimated functions.

Usage

smooth_samples(model, ...)

## S3 method for class 'gam'
smooth_samples(
  model,
  select = NULL,
  term = deprecated(),
  n = 1,
  data = newdata,
  method = c("gaussian", "mh", "inla", "user"),
  seed = NULL,
  freq = FALSE,
  unconditional = FALSE,
  n_cores = 1L,
  n_vals = 200,
  burnin = 1000,
  thin = 1,
  t_df = 40,
  rw_scale = 0.25,
  rng_per_smooth = FALSE,
  draws = NULL,
  partial_match = NULL,
  mvn_method = c("mvnfast", "mgcv"),
  ...,
  newdata = NULL,
  ncores = NULL
)

Arguments

model

a fitted model of the supported types

...

arguments passed to other methods. For fitted_samples(), these are passed on to mgcv::predict.gam(). For posterior_samples() these are passed on to fitted_samples(). For predicted_samples() these are passed on to the relevant simulate() method.

select

character; select which smooth's posterior to draw from. The default (NULL) means the posteriors of all smooths in model wil be sampled from. If supplied, a character vector of requested terms.

term

[Deprecated] Use select instead.

n

numeric; the number of posterior samples to return.

data

data frame; new observations at which the posterior draws from the model should be evaluated. If not supplied, the data used to fit the model will be used for data, if available in model.

method

character; which method should be used to draw samples from the posterior distribution. "gaussian" uses a Gaussian (Laplace) approximation to the posterior. "mh" uses a Metropolis Hastings sampler that alternates t proposals with proposals based on a shrunken version of the posterior covariance matrix. "inla" uses a variant of Integrated Nested Laplace Approximation due to Wood (2019), (currently not implemented). "user" allows for user-supplied posterior draws (currently not implemented).

seed

numeric; a random seed for the simulations.

freq

logical; TRUE to use the frequentist covariance matrix of the parameter estimators, FALSE to use the Bayesian posterior covariance matrix of the parameters.

unconditional

logical; if TRUE (and freq == FALSE) then the Bayesian smoothing parameter uncertainty corrected covariance matrix is used, if available.

n_cores

number of cores for generating random variables from a multivariate normal distribution. Passed to mvnfast::rmvn(). Parallelization will take place only if OpenMP is supported (but appears to work on Windows with current R).

n_vals

numeric; how many locations to evaluate the smooth at if data not supplied

burnin

numeric; number of samples to discard as the burnin draws. Only used with method = "mh".

thin

numeric; the number of samples to skip when taking n draws. Results in thin * n draws from the posterior being taken. Only used with method = "mh".

t_df

numeric; degrees of freedom for t distribution proposals. Only used with method = "mh".

rw_scale

numeric; Factor by which to scale posterior covariance matrix when generating random walk proposals. Negative or non finite to skip the random walk step. Only used with method = "mh".

rng_per_smooth

logical; if TRUE, the behaviour of gratia version 0.8.1 or earlier is used, whereby a separate call the the random number generator (RNG) is performed for each smooth. If FALSE, a single call to the RNG is performed for all model parameters

draws

matrix; user supplied posterior draws to be used when method = "user".

partial_match

logical; should smooths be selected by partial matches with select? If TRUE, select can only be a single string to match against.

mvn_method

character; one of "mvnfast" or "mgcv". The default is uses mvnfast::rmvn(), which can be considerably faster at generate large numbers of MVN random values than mgcv::rmvn(), but which might not work for some marginal fits, such as those where the covariance matrix is close to singular.

newdata

Deprecated: use data instead.

ncores

Deprecated; use n_cores instead. The number of cores for generating random variables from a multivariate normal distribution. Passed to mvnfast::rmvn(). Parallelization will take place only if OpenMP is supported (but appears to work on Windows with current R).

Value

A tibble with additional classes "smooth_samples" and '"posterior_samples".

For the "gam" method, the columns currently returned (not in this order) are:

Warning

The set of variables returned and their order in the tibble is subject to change in future versions. Don't rely on position.

Author(s)

Gavin L. Simpson

Examples

load_mgcv()

dat <- data_sim("eg1", n = 400, seed = 2)
m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

sms <- smooth_samples(m1, select = "s(x0)", n = 5, seed = 42)

sms


## A factor by example (with a spurious covariate x0)
dat <- data_sim("eg4", n = 1000, seed = 2)

## fit model...
m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat)
sms <- smooth_samples(m2, n = 5, seed = 42)
draw(sms)


[Package gratia version 0.9.2 Index]