beast-package {beast} | R Documentation |
Bayesian Estimation of Change-Points in the Slope of Multivariate Time-Series
Description
Assume that a temporal process is composed of contiguous segments with differing slopes and replicated noise-corrupted time series measurements are observed. The unknown mean of the data generating process is modelled as a piecewise linear function of time with an unknown number of change-points. The package infers the joint posterior distribution of the number and position of change-points as well as the unknown mean parameters per time-series by MCMC sampling. A-priori, the proposed model uses an overfitting number of mean parameters but, conditionally on a set of change-points, only a subset of them influences the likelihood. An exponentially decreasing prior distribution on the number of change-points gives rise to a posterior distribution concentrating on sparse representations of the underlying sequence, but also available is the Poisson distribution. See Papastamoulis et al (2017) <arXiv:1709.06111> for a detailed presentation of the method.
Details
The beast
package deals with Bayesian estimation of change-points in the slope of multivariate time-series, introduced by Papastamoulis et al (2017). For a given period t = 1,\ldots,T
we observe multiple time series which are assumed independent, each one consisting of multiple measurements (replicates). Each time series is assumed to have its own segmentation, which is common among its replicates. Thus, different time series have distinct mean parameters in the underlying normal distribution. The variance, which is assumed known, can be either shared between different time series or not and in practice it is estimated at a pre-processing stage.
Our model infers the joint posterior distribution of the number and location of change-points by MCMC sampling. For this purpose a Metropolis-Hastings MCMC sampler is used. The main function of the package is beast
.
Assume that the observed data consists of N
time-series, each one consisting of R
variables (or replicates) measured at T
consecutive time-points. The input of the main function beast
should be given in the form of a list
myDataList
with the following attributes:
-
length(myDataList)
should be equal toR
, that is, the number of variables (or replicates) -
dim(myDataList)[[1]]
,\ldots
,dim(myDataList)[[R]]
should be allT\times N
, that is, rows and columns should correspond to time-points and different series, respectively.
Then, a basic usage of the package consists of the following commands:
-
beastRun <- beast( myDataList = myDataList )
-
print(beastRun)
-
plot(beastRun)
which correspond to running the MCMC sampler, printing and plotting output summaries, respectively.
Author(s)
Panagiotis Papastamoulis
Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>
References
Papastamoulis P., Furukawa T., van Rhijn N., Bromley M., Bignell E. and Rattray M. (2017). Bayesian detection of piecewise linear trends in replicated time-series with application to growth data modelling. arXiv:1709.06111 [stat.AP]
See Also
beast, print.beast.object, plot.beast.object
Examples
# toy-example (MCMC iterations not enough)
library('beast') # load package
data("FungalGrowthDataset") # load dataset
myIndex <- c(392, 62, 3, 117) # run the sampler only for the
# specific subset of time-series
set.seed(1)
# Run MCMC sampler with very small number of iterations (nIter):
run_mcmc <- beast(myDataList = FungalGrowthDataset, subsetIndex = myIndex,
zeroNormalization = TRUE, nIter = 40, burn = 20)
# Print output:
print(run_mcmc)
# Plot output to file: "beast_plot.pdf"
plot(run_mcmc, fileName = "beast_plot_toy.pdf", timeScale=1/6, xlab = "hours", ylab = "growth")
# Run the following commands to obtain convergence:
## Not run:
# This example illustrates the package using a subset of four
# time-series of the fungal dataset.
library('beast') # load package
data("FungalGrowthDataset") # load dataset
myIndex <- c(392, 62, 3, 117) # run the sampler only for the
# specific subset of time-series
set.seed(1) # optional
# Run MCMC sampler with the default number of iterations (nIter =70000):
run_mcmc <- beast(myDataList = FungalGrowthDataset, subsetIndex = myIndex,
zeroNormalization = TRUE)
# Print output:
print(run_mcmc)
# Plot output to file: "beast_plot.pdf"
plot(run_mcmc, fileName = "beast_plot.pdf", timeScale=1/6, xlab = "hours", ylab = "growth")
# NOTE 1: for a complete analysis remove the `subsetIndex = myIndex` argument.
# NOTE 2: `zeroNormalization = TRUE` is an optional argument that forces all
# time-series to start from zero. It is not supposed to be used
# for other applications.
## End(Not run)