R: General univariate linear-Gaussian state space models

ssm_ulg {bssm}

R Documentation

General univariate linear-Gaussian state space models

Description

Construct an object of class ssm_ulg by directly defining the corresponding terms of the model.

Usage

ssm_ulg(
  y,
  Z,
  H,
  T,
  R,
  a1 = NULL,
  P1 = NULL,
  init_theta = numeric(0),
  D = NULL,
  C = NULL,
  state_names,
  update_fn = default_update_fn,
  prior_fn = default_prior_fn
)

Arguments

`y`	Observations as time series (or vector) of length `n`.
`Z`	System matrix Z of the observation equation. Either a vector of length m, a m x n matrix, or object which can be coerced to such.
`H`	A vector of standard deviations. Either a scalar or a vector of length n.
`T`	System matrix T of the state equation. Either a m x m matrix or a m x m x n array, or object which can be coerced to such.
`R`	Lower triangular matrix R the state equation. Either a m x k matrix or a m x k x n array, or object which can be coerced to such.
`a1`	Prior mean for the initial state as a vector of length m.
`P1`	Prior covariance matrix for the initial state as m x m matrix.
`init_theta`	Initial values for the unknown hyperparameters theta (i.e. unknown variables excluding latent state variables).
`D`	Intercept terms `D_t` for the observations equation, given as a scalar or vector of length n.
`C`	Intercept terms `C_t` for the state equation, given as a m times 1 or m times n matrix.
`state_names`	A character vector defining the names of the states.
`update_fn`	A function which returns list of updated model components given input vector theta. This function should take only one vector argument which is used to create list with elements named as `Z`, `H`, `T`, `R`, `a1`, `P1`, `D`, and `C`, where each element matches the dimensions of the original model It's best to check the internal dimensions with `str(model_object)` as the dimensions of input arguments can differ from the final dimensions. If any of these components is missing, it is assumed to be constant wrt. theta.
`prior_fn`	A function which returns log of prior density given input vector theta.

Details

The general univariate linear-Gaussian model is defined using the following observational and state equations:

y_t = D_t + Z_t \alpha_t + H_t \epsilon_t, (\textrm{observation equation})

\alpha_{t+1} = C_t + T_t \alpha_t + R_t \eta_t, (\textrm{transition equation})

where \epsilon_t \sim N(0, 1), \eta_t \sim N(0, I_k) and \alpha_1 \sim N(a_1, P_1) independently of each other. Here k is the number of disturbance terms which can be less than m, the number of states.

The update_fn function should take only one vector argument which is used to create list with elements named as Z, H T, R, a1, P1, D, and C, where each element matches the dimensions of the original model. If any of these components is missing, it is assumed to be constant wrt. theta. Note that while you can input say R as m x k matrix for ssm_ulg, update_fn should return R as m x k x 1 in this case. It might be useful to first construct the model without updating function and then check the expected structure of the model components from the output.

Value

An object of class ssm_ulg.

Examples


# Regression model with time-varying coefficients
set.seed(1)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
b1 <- 1 + cumsum(rnorm(n, sd = 0.5))
b2 <- 2 + cumsum(rnorm(n, sd = 0.1))
y <- 1 + b1 * x1 + b2 * x2 + rnorm(n, sd = 0.1)

Z <- rbind(1, x1, x2)
H <- 0.1
T <- diag(3)
R <- diag(c(0, 1, 0.1))
a1 <- rep(0, 3)
P1 <- diag(10, 3)

# updates the model given the current values of the parameters
update_fn <- function(theta) {
  R <- diag(c(0, theta[1], theta[2]))
  dim(R) <- c(3, 3, 1)
  list(R = R, H = theta[3])
}
# prior for standard deviations as half-normal(1)
prior_fn <- function(theta) {
  if(any(theta < 0)) {
    log_p <- -Inf 
  } else {
    log_p <- sum(dnorm(theta, 0, 1, log = TRUE))
  }
  log_p
}

model <- ssm_ulg(y, Z, H, T, R, a1, P1, 
  init_theta = c(1, 0.1, 0.1), 
  update_fn = update_fn, prior_fn = prior_fn, 
  state_names = c("level", "b1", "b2"),
  # using default values, but being explicit for testing purposes
  C = matrix(0, 3, 1), D = numeric(1))

out <- run_mcmc(model, iter = 5000)
out
sumr <- summary(out, variable = "state", times = 1:n)
sumr$true <- c(b1, b2, rep(1, n))
library(ggplot2)
ggplot(sumr, aes(x = time, y = Mean)) +
geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), alpha = 0.5) +
geom_line() + 
geom_line(aes(y = true), colour = "red") + 
facet_wrap(~ variable, scales = "free") +
theme_bw()

# Perhaps easiest way to construct a general SSM for bssm is to use the 
# model building functionality of KFAS:
library("KFAS")

model_kfas <- SSModel(log(drivers) ~ SSMtrend(1, Q = 5e-4)+
  SSMseasonal(period = 12, sea.type = "trigonometric", Q = 0) +
 log(PetrolPrice) + law, data = Seatbelts, H = 0.005)

# use as_bssm function for conversion, kappa defines the 
# prior variance for diffuse states
model_bssm <- as_bssm(model_kfas, kappa = 100)

# define updating function for parameter estimation
# we can use SSModel and as_bssm functions here as well
# (for large model it is more efficient to do this 
# "manually" by constructing only necessary matrices,
# i.e., in this case  a list with H and Q)

prior_fn <- function(theta) {
  if(any(theta < 0)) -Inf else sum(dnorm(theta, 0, 0.1, log = TRUE))
}
 
update_fn <- function(theta) {
  
  model_kfas <- SSModel(log(drivers) ~ SSMtrend(1, Q = theta[1]^2)+
    SSMseasonal(period = 12, 
      sea.type = "trigonometric", Q = theta[2]^2) +
    log(PetrolPrice) + law, data = Seatbelts, H = theta[3]^2)
  
  # the bssm_model object is essentially list so this is fine
  as_bssm(model_kfas, kappa = 100, init_theta = init_theta,
    update_fn = update_fn, prior_fn = prior_fn) 
}

init_theta <- rep(1e-2, 3)
names(init_theta) <- c("sd_level", "sd_seasonal", "sd_y")

model_bssm <- update_fn(init_theta)


out <- run_mcmc(model_bssm, iter = 10000, burnin = 5000) 
out

# Above the regression coefficients are modelled as 
# time-invariant latent states. 
# Here is an alternative way where we use variable D so that the
# coefficients are part of parameter vector theta. Note however that the 
# first option often preferable in order to keep the dimension of theta low.

updatefn2 <- function(theta) {
  # note no PetrolPrice or law variables here
  model_kfas2 <- SSModel(log(drivers) ~ SSMtrend(1, Q = theta[1]^2)+
    SSMseasonal(period = 12, sea.type = "trigonometric", Q = theta[2]^2), 
    data = Seatbelts, H = theta[3]^2)
  
  X <- model.matrix(~ -1 + law + log(PetrolPrice), data = Seatbelts)
  D <- t(X %*% theta[4:5])
  as_bssm(model_kfas2, D = D, kappa = 100)
}
prior2 <- function(theta) {
 if(any(theta[1:3] < 0)) {
  -Inf
 } else {
   sum(dnorm(theta[1:3], 0, 0.1, log = TRUE)) +
   sum(dnorm(theta[4:5], 0, 10, log = TRUE))
 }
}
init_theta <- c(rep(1e-2, 3), 0, 0)
names(init_theta) <- c("sd_level", "sd_seasonal", "sd_y", "law", "Petrol")
model_bssm2 <- updatefn2(init_theta)
model_bssm2$theta <- init_theta
model_bssm2$prior_fn <- prior2
model_bssm2$update_fn <- updatefn2

out2 <- run_mcmc(model_bssm2, iter = 10000, burnin = 5000) 
out2

[Package bssm version 2.0.2 Index]