CausalMBSTS {CausalMBSTS}  R Documentation 
Causal effect estimation in a multivariate setting
Description
The function estimates the general effect of an intervention in a multivariate time series setting. It uses MCMC to sample from the joint posterior distribution of the parameters of an MBSTS model before the intervention/treatment occurred. Then, it uses the postintervention covariate values to predict the counterfactual potential outcomes. The prediction is done by sampling from the posterior predictive distribution (PPD). Then the causal effect is computed by taking the difference between the observed outcome of each time series and the mean of the PPD (credible intervals are computed accordingly).
Usage
CausalMBSTS(
y,
components,
seas.period = NULL,
cycle.period = NULL,
X = NULL,
dates,
int.date,
alpha = 0.05,
excl.dates = NULL,
horizon = NULL,
H = NULL,
nu0.r = NULL,
s0.r,
nu0.eps = NULL,
s0.eps,
niter,
burn = NULL,
ping = NULL
)
Arguments
y 
t x d data.frame (or matrix) of observations, where d is the number of time series in the multivariate model. 
components 
Character vector specifying the components of the multivariate structural time series model. Possible values are c("trend", "slope", "seasonal", "cycle"). 
seas.period 
Length of the seasonal pattern, if present. 
cycle.period 
Length of the cycle pattern, if present. 
X 
Optional t x N data frame (or matrix) of N predictors. 
dates 
a vector of dates of length t (with elements of class

int.date 
Date of the intervention (must be of class 
alpha 
Level of credible interval to report for the estimated causal effect. Default set to 0.05 (i.e., reporting a twosided 95% credible interval). 
excl.dates 
Optional vector of length t, specifying the dates (if any)
in the post period that should be excluded from the computation of the
causal effect. The elements of the vector must be either 0 (the
corresponding date is retained) or 1 (the corresponding date is excluded).
The first part that corresponds to 
horizon 
Optional, vector of dates (with elements of class

H 
P x P variancecovariance matrix of the regression coefficients. Set
by default to H = c(X'X)^(1) which is akin to the Zellner's gprior. The
value of the scaling factor is set to 
nu0.r 
Degrees of freedom of the InverseWishart prior for each element of Sigma.r, a vector of errors for state r. Set by default to d + 2 (must be greater than d  1). 
s0.r 
Scale matrix of the InverseWishart prior for each Sigma.r, a vector of errors for state r. Must be a (d x d) positive definite. Default set to the variancecovariance matrix of y multiplied by a scaling factor of 0.01. 
nu0.eps 
Degrees of freedom of the InverseWishart prior for Sigma.eps, a vector of observation errors for each time series. Set by default to d + 2 (must be greater than d  1). 
s0.eps 
Scale matrix of the InverseWishart prior for Sigma.eps, a vector of observation errors for each time series. Must be a (d x d) positive definite. Default set to the variancecovariance matrix of y multiplied by a scaling factor of 0.01. 
niter 
Number of MCMC iterations. 
burn 
Desired burnin, set by default to 0.1 * 
ping 
A status message is printed every 
Details
The assumed model is based on Normally distributed disturbance terms. The argument
components
provides flexibility for model formulation, allowing to add
simultaneously up to four components that encapsulate the characteristics of a time series.
The unknown parameters are the variancecovariance matrices of the error terms and, if covariates are provided, the matrix of regression coefficients. Because of conjugacy, the priors placed on the variancecovariance matrices of the error terms are InverseWishart distributions and the arguments (nu0.eps, s0.eps) and (nu0.r, s0.r) regulate their hyperparameters.
The regression coefficients are assumed to follow a matrixNormal prior and, to incorporate a sparsity assumption, the prior mean is set to zero and a vector selecting the relevant covariates is introduced with a data augmentation step.
Sampling from the joint posterior distribution of the states and model parameters is performed with a Gibbs sampler. The estimated model is then used to perform predictions of the counterfactual potential outcomes in the period following the intervention date. In a final step, the predictions are compared to the observed outcomes, thereby defining a causal effect at each time point (pointwise effect).
The output component general.effect
summarizes the estimates of two causal effects: the average
causal effect (temporal average of the pointwise effect) and the cumulative causal effect (cumulative
sum of the pointwise effect).
Run vignette("CausalMBSTS")
for a detailed example.
For further details see Menchetti & Bojinov (2020).
Value
A list with the following components:
mcmc 
An object of class 
predict 
A list with the same components as those produced by the function 
y 
Observations in the analysis period (excluding 
dates 
Dates in the analysis period (excluding 
general 
General causal effect for all iterations. 
general.effect 
Estimated average causal effect, cumulative causal effect and (1 
mean.general 
Pointwise effect. Returns a list if 
lower.general 
Lower bound of a (1 
upper.general 
Upper bound of a (1 
Examples
## Note: The following are toy examples, for a proper analysis we recommend to run
## at least 1000 iterations and check the convergence of the Markov chain
## Example 1 (daily data, d = 3, local level + seasonal + covariates)
# Generating a panel of observations and a vector of dates
set.seed(1)
y < cbind(seq(0.5,100,by=0.5)*0.1 + rnorm(200),
seq(100.25,150,by=0.25)*0.05 + rnorm(200),
seq(1,200,by=1)*(0.01) + rnorm(200, 0, 0.5))
dates < seq.Date(from = as.Date('20190110'),by = "days", length.out = 200)
# Adding a fictional intervention and four covariates (they should be related
# to the outcome but unaffected by the intervention). To illustrate the
# functioning of Bayesian model selection, one covariate is assumed to be
# unrelated to y.
int.date < as.Date('20190401')
y.new < y; y.new[dates >= int.date, ] < y.new[dates >= int.date, ]*1.3
x1 < y[,1]*0.5 + y[,2]*0.3 + y[,3]*0.1
x2 < y[,2]*0.1 + rnorm(dim(y)[1],0,0.5)
x3 < y[,3]*1.2 + rnorm(dim(y)[1],0,0.5)
x4 < rnorm(dim(y)[1], 5, 10)
X < cbind(x1, x2, x3, x4)
# Some plots
oldpar < par(no.readonly = TRUE)
par(mfrow=c(1,3))
for(i in 1:dim(y.new)[2]){
plot(y.new[,i], x = dates, type='l', col='cadetblue', xlab='', ylab='', main= bquote(Y[.(i)]))
lines(y[,i], x = dates, col='orange')
}
par(mfrow=c(1,4))
for(i in 1:dim(X)[2]){
plot(X[,i], type='l', col = 'darkgreen', x = dates, xlab='', ylab='', main = bquote(x[.(i)]))
}
par(oldpar)
# Causal effect estimation
causal.1 < CausalMBSTS(y.new, components = c("trend", "seasonal"), seas.period = 7,
X = X, dates = dates, int.date = int.date, s0.r = 0.1*diag(3),
s0.eps = 0.1*diag(3),niter = 50, burn = 5,
horizon = as.Date(c('20190408','20190728')))
summary(causal.1)
## Example 2 (weekly data, local level + cycle, d = 2)
set.seed(1)
t < seq(from = 0,to = 4*pi, length.out=300)
y < cbind(3*sin(2*t)+rnorm(300), 2*cos(2*t) + rnorm(300))
dates < seq.Date(from = as.Date("20150101"), by = "week", length.out=300)
int.date < as.Date("20200227")
y[dates >= int.date,] < y[dates >= int.date,]+2
# Some plots
plot(y = y[,1], x=dates, type="l", col="cadetblue")
lines(y = y[,2], x = dates, col = "orange")
abline(v=int.date, col="red")
# Causal effect estimation
causal.2 < CausalMBSTS(y, components = c("trend", "cycle"), cycle.period = 75,
dates = dates, int.date = int.date, s0.r = 0.01*diag(2),
s0.eps = 0.1*diag(2), niter = 100, burn = 10)
summary(causal.2)