R: Function that fits a multivariate correlated product...

ccp_ppm {ppmSuite}

R Documentation

Function that fits a multivariate correlated product partition change point model

Description

ccp_ppm is a function that fits a Bayesian product partition change point model, where the set of change point indicators between time series are correlated.

Usage

ccp_ppm(ydata, model=1,
         nu0, mu0, sigma0,
         mltypes, thetas,
         devs,
         nburn, nskip, nsave,
         verbose = FALSE)

Arguments

`ydata`	An `L \times n` data matrix, where `L` is the number of time series and `n`, the number of time points.
`model`	Determines of model fit is such that there are p_its (model=1) or only p_t (model=2)
`nu0`	Degrees of freedom of the multivariate Student's t-distribution (see section Details).
`mu0`	Location vector of dimension `L` (see section Details).
`sigma0`	Positive definite scale matrix of order `L \times L` (see section Details).
`mltypes`	Type of marginal likelihood. Currently only available is: `mltypes = 1`. Observations within a block are conditionally independent `Normal(\mu, \sigma^2)` variates with mean `\mu` and variance `\sigma^2`. The desired marginal likelihood is obtained after integrating `(\mu, \sigma^2)` with respect to a `Normal-Inverse-Gamma(\mu_{0}, \kappa_{0}, \alpha_{0}, \beta_{0})` prior.
`thetas`	An `L \times q` matrix containing hyperparameters associated with the marginal likelihood. The number of rows `(L)` corresponds to the number of series. The number of columns `(q)` depend on the marginal likelihood: If `mltypes = 1`, then `q = 4` and `thetas` equals the hyperparameter `(\mu_{0}, \kappa_{0}, \alpha_{0}, \beta_{0})` of the Normal-Inverse-Gamma prior.
`devs`	An `L \times (n - 1)` matrix containing the standard deviations of the candidate density associated with the random walk Metropolis-Hastings steps for updating change point probabilities.
`nburn`	The number of initial MCMC iterates to be discarded as burn-in.
`nskip`	The amount to thinning that should be applied to the MCMC chain.
`nsave`	Then number of MCMC iterates to be stored.
`verbose`	Logical indicating whether to print to screen the MCMC progression. The default value is `verbose = FALSE`.

Details

As described in Quinlan et al. (add cite), for each time series \boldsymbol{y}_{i} = (y_{i,1}, \ldots , y_{i,n})':

\boldsymbol{y}_{i} \mid \rho_{i} \sim \prod_{j = 1}^{b_{i}}\mathcal{F}(\boldsymbol{y}_{i,j} \mid \boldsymbol{\theta}_{i})

\rho_{i} \mid (p_{i,1}, \ldots , p_{i,n-1})' \sim \prod_{t \in T_{i}}p_{i,t} \prod_{t \notin T_{i}}(1 - p_{i,t}) : T_{i} = \{\tau_{i,1}, \ldots, \tau_{i,b_{i} - 1}\}

(p_{1,t}, \ldots , p_{L,t})' \sim logit-t(\nu_{0}, \boldsymbol{\mu}_{0}, \boldsymbol{\Sigma}_{0}).

Here, \rho_{i} = \{S_{i,1}, \ldots , S_{i,b_{i}}\} is a partition of the set \{1, \ldots , n\} into b_{i} contiguous blocks, and \boldsymbol{y}_{i,j} = (y_{i,t} : t \in S_{i,j})'. Also, \tau_{i,j} = \max(S_{i,j}) and \mathcal{F}( \cdot \mid \boldsymbol{\theta}_{i}) is a marginal likelihood function which depends on the nature of \boldsymbol{y}_{i}, indexed by a hyperparameter \boldsymbol{\theta}_{i}. In addition, logit-t(\nu_{0}, \boldsymbol{\mu}_{0}, \boldsymbol{\Sigma}_{0}) is the logit of a multivariate Student's t-distribution with degrees of freedom \nu_{0}, location vector \boldsymbol{\mu}_{0} and scale matrix \boldsymbol{\Sigma}_{0}.

Value

The function returns a list containing arrays filled with MCMC iterates corresponding to model parameters. In order to provide more detail, in what follows let M be the number of MCMC iterates collected. The output list contains the following:

C. An M \times \{L(n - 1)\} matrix containing MCMC iterates associated with each series indicators of a change point. The mth row in C is divided into L blocks; the first (n - 1) change point indicators for time series 1, the next (n - 1) change point indicators for time series 2, and so on.
P. An M \times \{L(n - 1)\} matrix containing MCMC iterates associated with each series probability of a change point. The mth row in P is divided into L blocks; the first (n - 1) change point probabilities for time series 1, the next (n - 1) change point probabilities for time series 2, and so on.

Examples


# Generate data that has two series, each with 100 observations
y1 <- replicate(25, rnorm(4, c(-1, 0, 1, 2), c(0.1, 0.25, 0.5, 0.75)))
y2 <- replicate(25, rnorm(4, c(2, 1, 0, -2), c(0.1, 0.25, 0.5, 0.75)))
y <- rbind(c(t(y1)), c(t(y2)))

# Marginal likelihood parameters
thetas <- matrix(1, nrow = 2,ncol = 4)
thetas[1,] <- c(0, 1, 2, 1)
thetas[2,] <- c(0, 1, 2, 1)

# M-H candidate density standard deviations
devs = matrix(0.1, nrow = 2, ncol = (dim(y)[2] - 1))

# Prior parameters for logit-t distribution
L <- nrow(y)
pivar <- 10
picorr <- 0.9
pimu <- rep(-6, L) # mean associated with logit of p_i
piSigma <- pivar*picorr*(rep(1, L) %*% t(rep(1, L))) +
           pivar*(1 - picorr)*diag(L)
nu0 = 3
mu0 = pimu
sigma0 = piSigma

# Fit the bayesian ppm change point model
fit <- ccp_ppm(nburn = 1000, nskip = 1, nsave = 1000, ydata = y, nu0 = nu0,
               mu0 = mu0, sigma0 = sigma0, mltypes = c(1, 1), thetas = thetas,
               devs = devs)

[Package ppmSuite version 0.3.4 Index]