R: Fit Bayesian Graphical Vector Autoregressive (GVAR) Models...

stan_gvar {tsnet}

R Documentation

Fit Bayesian Graphical Vector Autoregressive (GVAR) Models with Stan

Description

This function fits a Bayesian GVAR model to the provided data using Stan. The estimation procedure is described further in Siepe, Kloft & Heck (2023) <doi:10.31234/osf.io/uwfjc>. The current implementation allows for a normal prior on the temporal effects and either an Inverse Wishart or an LKJ prior on the contemporaneous effects. 'rstan' is used as a backend for fitting the model in Stan. Data should be provided in long format, where the columns represent the variables and the rows represent the time points. Data are automatically z-scaled for estimation.

Usage

stan_gvar(
  data,
  beep = NULL,
  priors = NULL,
  method = "sampling",
  cov_prior = "IW",
  rmv_overnight = FALSE,
  iter_sampling = 500,
  iter_warmup = 500,
  n_chains = 4,
  n_cores = 1,
  center_only = FALSE,
  ...
)

Arguments

`data`	A data frame or matrix containing the time series data of a single subject. The data should be in long format, where the columns represent the variables and the rows represent the time points. See the example data [ts_data] for the correct format.
`beep`	A vector of beeps with length of 'nrow(data)'. The beep indicator can be used to remove overnight effects from the last beep of a day to the first beep of the next day. This should be a vector of positive integers. If left empty, the function will assume that there are no overnight effects to remove.
`priors`	A list of prior distributions for the model parameters. This should be a named list, with names corresponding to the parameter names and values corresponding to the prior distributions. The following priors can be specified: 'prior_Beta_loc' A matrix of the same dimensions as the beta matrix 'B' containing the mean of the prior distribution for the beta coefficients. 'prior_Beta_scale' A matrix of the same dimensions as the beta matrix 'B' containing the standard deviation of the prior distribution for the beta coefficients.
`method`	A string indicating the method to use for fitting the model. Options are "sampling" (for MCMC estimation) or "variational" (for variational inference). We currently recommend only using MCMC estimation.
`cov_prior`	A string indicating the prior distribution to use for the covariance matrix. Options are "LKJ" or "IW" (Inverse-Wishart).
`rmv_overnight`	A logical indicating whether to remove overnight effects. Default is 'FALSE'. If 'TRUE', the function will remove overnight effects from the last beep of a day to the first beep of the next day. This requires the 'beep' argument to be specified.
`iter_sampling`	An integer specifying the number of iterations for the sampling method. Default is 500.
`iter_warmup`	An integer specifying the number of warmup iterations for the sampling method. Default is 500.
`n_chains`	An integer specifying the number of chains for the sampling method. Default is 4. If variational inference is used, the number of iterations is calculated as 'iter_sampling'*'n_chains'.
`n_cores`	An integer specifying the number of cores to use for parallel computation. Default is 1. [rstan] is used for parallel computation.
`center_only`	A logical indicating whether to only center (and not scale) the data. Default is 'FALSE'.
`...`	Additional arguments passed to the 'rstan::sampling' or 'rstan::vb' function.

Details

General Information

In a Graphical Vector Autoregressive (GVAR) model of lag 1, each variable is regressed on itself and all other variables at the previous timepoint to obtain estimates of the temporal association between variables (encapsulated in the beta matrix). This is the "Vector Autoregressive" part of the model. Additionally, the innovation structure at each time point (which resembles the residuals) is modeled to obtain estimates of the contemporaneous associations between all variables (controlling for the lagged effects). This is typically represented in the partial correlation (pcor) matrix. If the model is represented and interpreted as a network, variables are called *nodes*, *edges* represent the statistical association between the nodes, and *edge weights* quantify the strength of these associations.

Model

Let Y be a matrix with n rows and p columns, where n_t is the number of time points and p is the number of variables. The GVAR model is given by the following equations:

Y_t = B* Y_{t-1} + zeta_t

zeta_t \sim N(0, \Sigma)

where B is a 'p x p' matrix of VAR coefficients between variables i and j (beta_ij), \zeta_t contains the innovations at time point t, and \Sigma is a 'p x p'covariance matrix. The inverse of \Sigma is the precision matrix, which is used to obtain the partial correlations between variables (rho_ij). The model setup is explained in more detail in Siepe, Kloft & Heck (2023) <doi:10.31234/osf.io/uwfjc>.

Prior Setup

For the p x p temporal matrix B (containing the beta coefficients), we use a normal prior distribution on each individual parameter:

\beta_{ij} \sim N(PriorBetaLoc_{ij}, PriorBetaScale_{ij})

where 'PriorBetaLoc' is the mean of the prior distribution and 'PriorBetaScale' is the standard deviation of the prior distribution. The default prior is a weakly informative normal distribution with mean 0 and standard deviation 0.5. The user can specify a different prior distribution by a matrix 'prior_Beta_loc' and a matrix 'prior_Beta_scale' with the same dimensions as B.

Both a Lewandowski-Kurowicka-Joe (LKJ) and an Inverse-Wishart (IW) distribution can be used as a prior for the contemporaneous network. However, the LKJ prior does not allow for direct specifications of priors on the partial correlations. We implemented a workaround to enable priors on specific partial correlations (described below). We consider this feature experimental would advise users wishing to implement edge-specific priors in the contemporaneous network to preferentially use IW priors.

The LKJ prior is a distribution on the correlation matrix, which is parameterized by the shape parameter \eta. To enable edge-specific priors on the partial correlations, we use the workaround of a "joint" prior that, in addition to the LKJ on the correlation matrix itself, allows for an additional beta prior on each of the partial correlations. We first assigned an uninformed LKJ prior to the Cholesky factor decomposition of the correlation matrix of innovations:

\Omega_L \sim LKJ-Cholesky(\eta)

. For \eta = 1, this implies a symmetric marginal scaled beta distribution on the zero-order correlations \omega_{ij}.

(\omega_{ij}+1)/2 \sim Beta(p/2, p/2)

We can then obtain the covariance matrix and, subsequently, the precision matrix (see Siepe, Kloft & Heck (2023)) for details. The second part of the prior is a beta prior on each partial correlation \rho_{ij} (obtained from the off-diagonal elements of the precision matrix). This prior was assigned by transforming the partial correlations to the interval of 0,1 and then assigning a proportional (mean-variance parameterized) beta prior:

(\rho_{ij}+1)/2 \sim Beta_{prop}(PriorRhoLoc, PriorRhoScale)

A beta location parameter of 0.5 translates to an expected correlation of 0. The variance parameter of sqrt(0.5) implies a uniform distribution of partial correlations. The user can specify a different prior distribution by a matrix 'prior_Rho_loc' and a matrix 'prior_Rho_scale' with the same dimensions as the partial correlation matrix. Additionally, the user can change eta via the 'prior_Eta' parameter.

The Inverse-Wishart prior is a distribution on the innovation covariance matrix 'Sigma':

\Sigma \sim IW(\nu, S)

where \nu is the degrees of freedom and S is the scale matrix. We here use the default prior of

nu = delta + p - 1

for the degrees of freedom, where \delta is defined as s_{\rho}^{-1}-1 and s_{\rho} is the standard deviation of the implied marginal beta distribution of the partial correlations. For the scale matrix S, we use the identity matrix I_p of order p. The user can set a prior on the expected standard deviation of the partial correlations by specifying a 'prior_Rho_marginal' parameter. The default value is 0.25, which has worked well in a simulation study. Additionally, the user can specify a 'prior_S' parameter to set a different scale matrix.

Sampling The model can be fitted using either MCMC sampling or variational inference via [rstan]. Per default, the model is fitted using the Stan Hamiltonian Monte Carlo (HMC) No U-Turn (NUTS) sampler with 4 chains, 500 warmup iterations and 500 sampling iterations. We use a default target average acceptance probability 'adapt_delta' of 0.8. As the output is returned as a standard 'stanfit' object, the user can use the 'rstan' package to extract and analyze the results and obtain convergence diagnostics.

Value

A 'tsnet_fit' object in list format. The object contains the following elements:

`fit`	A stanfit object containing the fitted model.
`arguments`	The number of variables "p", the number of time points "n_t", the column names "cnames", and the arguments used in the function call.

Examples


# Load example data
data(ts_data)
example_data <- ts_data[1:100,1:3]

# Fit the model
fit <- stan_gvar(example_data,
                 method = "sampling",
                 cov_prior = "IW",
                 n_chains = 2)
print(fit)

[Package tsnet version 0.1.0 Index]