R: Calculating WAIC using an offline algorithm

calculateWAIC {nimble}

R Documentation

Calculating WAIC using an offline algorithm

Description

In addition to the core online algorithm, NIMBLE implements an offline WAIC algorithm that can be computed on the results of an MCMC. In contrast to NIMBLE's built-in online WAIC, offline WAIC can compute only conditional WAIC and does not allow for grouping data nodes.

Usage

calculateWAIC(mcmc, model, nburnin = 0, thin = 1)

Arguments

`mcmc`	An MCMC object (compiled or uncompiled) or matrix or dataframe of MCMC samples as the first argument of `calculateWAIC`.
`model`	A model (compiled or uncompiled) as the second argument of `calculateWAIC`. Only required if `mcmc` is a matrix/dataframe of samples.
`nburnin`	The number of pre-thinning MCMC samples to remove from the beginning of the posterior samples for offline WAIC calculation via `calculateWAIC` (default = 0). These samples are discarded in addition to any burn-in specified when running the MCMC.
`thin`	Thinning factor interval to apply to the samples for offline WAIC calculation using `calculateWAIC` (default = 1, corresponding to no thinning).

Details

The ability to calculate WAIC post hoc after all MCMC sampling has been done has certain advantages (e.g., allowing a user to calculate WAIC from MCMC chains run separately) in addition to providing compatibility with versions of NIMBLE before 0.12.0. This functionality includes the ability to call the calculateWAIC function on an MCMC object or matrix of samples after running an MCMC and without setting up the MCMC initially to use WAIC.

Important: The necessary variables to compute WAIC (all stochastic parent nodes of the data nodes) must have been monitored when setting up the MCMC.

Also note that while the model argument can be either a compiled or uncompiled model, the model must have been compiled prior to calling calculateWAIC.

See help(waic) for details on using NIMBLE's recommended online algorithm for WAIC.

Offline WAIC (WAIC computed after MCMC sampling)

As an alternative to online WAIC, NIMBLE also provides a function, calculateWAIC, that can be called on an MCMC object or a matrix of samples, after running an MCMC. This function does not require that one set enableWAIC = TRUE nor WAIC = TRUE when calling runMCMC. The function checks that the necessary variables were monitored in the MCMC and returns an error if they were not. This function behaves identically to the calculateWAIC method of an MCMC object. Note that to use this function when using nimbleMCMC one would need to build the model outside of nimbleMCMC.

The calculateWAIC function requires either an MCMC object or a matrix (or dataframe) of posterior samples plus a model object. In addition, one can provide optional burnin and thin arguments.

In addition, for compatibility with older versions of NIMBLE (prior to v0.12.0), one can also use the calculateWAIC method of the MCMC object to calculate WAIC after all sampling has been completed.

The calculateWAIC() method accepts a single argument, nburnin, equivalent to the nburnin argument of the calculateWAIC function described above.

The calculateWAIC method can only be used if the enableWAIC argument to configureMCMC or to buildMCMC is set to TRUE, or if the NIMBLE option enableWAIC is set to TRUE. If a user attempts to call calculateWAIC without having set enableWAIC = TRUE (either in the call to configureMCMC, or buildMCMC, or as a NIMBLE option), an error will occur.

The calculateWAIC function and method calculate the WAIC based on Equations 5, 12, and 13 in Gelman et al. (2014) (i.e., using pWAIC2).

Note that there is not a unique value of WAIC for a model. The calculateWAIC function and method only provide the conditional WAIC, namely the version of WAIC where all parameters directly involved in the likelihood are treated as theta for the purposes of Equation 5 from Gelman et al. (2014). As a result, the user must set the MCMC monitors (via the monitors argument) to include all stochastic nodes that are parents of any data nodes; by default the MCMC monitors are only the top-level nodes of the model. For more detail on the use of different predictive distributions, see Section 2.5 from Gelman et al. (2014) or Ariyo et al. (2019). Also note that WAIC relies on a partition of the observations, i.e., 'pointwise' prediction. In calculateWAIC the sum over log pointwise predictive density values treats each data node as contributing a single value to the sum. When a data node is multivariate, that data node contributes a single value to the sum based on the joint density of the elements in the node. Note that if one wants the WAIC calculation via calculateWAIC to be based on the joint predictive density for each group of observations (e.g., grouping the observations from each person or unit in a longitudinal data context), one would need to use a multivariate distribution for the observations in each group (potentially by writing a user-defined distribution).

For more control over and flexibility in how WAIC is calculated, see help(waic).

Author(s)

Joshua Hug and Christopher Paciorek

References

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research 11: 3571-3594.

Gelman, A., Hwang, J. and Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing 24(6): 997-1016.

Ariyo, O., Quintero, A., Munoz, J., Verbeke, G. and Lesaffre, E. (2019). Bayesian model selection in linear mixed models for longitudinal data. Journal of Applied Statistics 47: 890-913.

Vehtari, A., Gelman, A. and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing 27: 1413-1432.

Hug, J.E. and Paciorek, C.J. (2021). A numerically stable online implementation and exploration of WAIC through variations of the predictive density, using NIMBLE. arXiv e-print <arXiv:2106.13359>.

Examples

code <- nimbleCode({
  for(j in 1:J) {
    for(i in 1:n) 
      y[j, i] ~ dnorm(mu[j], sd = sigma)
    mu[j] ~ dnorm(mu0, sd = tau)
  }
  tau ~ dunif(0, 10)
  sigma ~ dunif(0, 10)
})
J <- 5
n <- 10
y <- matrix(rnorm(J*n), J, n)
Rmodel <- nimbleModel(code, constants = list(J = J, n = n), data = list(y = y),
                      inits = list(tau = 1, sigma = 1))

## Make sure the needed variables are monitored.
## Only conditional WAIC without data grouping is available via this approach.
conf <- configureMCMC(Rmodel, monitors = c('mu', 'sigma'))
## Not run: 
Cmodel <- compileNimble(Rmodel)
Rmcmc <- buildMCMC(conf)
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)
output <- runMCMC(Cmcmc, niter = 1000)
calculateWAIC(Cmcmc)           # Can run on the MCMC object
calculateWAIC(output, Rmodel)  # Can run on the samples directly

## Apply additional burnin (additional to any burnin already done in the MCMC.
calculateWAIC(Cmcmc, burnin = 500)

## End(Not run)

[Package nimble version 1.2.1 Index]