deseats {deseats} | R Documentation |
Locally Weighted Regression for Trend and Seasonality in Equidistant Time Series under Short Memory
Description
Simultaneously estimate the trend and the seasonality via locally weighted regression in an equidistant time series under short memory. The default setting uses an iterative plug-in algorithm for identifying the asymptotically globally optimal bandwidth for smoothing.
Usage
deseats(
y,
smoothing_options = set_options(),
bwidth_start = NULL,
inflation_rate = c("optimal", "naive"),
correction_factor = FALSE,
autocor = TRUE,
drop = NULL,
error_model = c("free", "ARMA"),
nar_lim = c(0, 3),
nma_lim = c(0, 3),
arma_mean = FALSE
)
Arguments
y |
a numerical vector or a time series object of class |
smoothing_options |
an S4 object of class |
bwidth_start |
a single numeric value that is only relevant if the slot
|
inflation_rate |
a character vector of length one that indicates, which inflation rate
to use in the bandwidth selection; for a local linear trend, we have
|
correction_factor |
A logical vector of length one; theoretically, a
larger bandwidth to estimate the sum of autocovariances from residuals of
pilot trend and seasonality estimates is advisable than for estimating trend
and seasonality; for |
autocor |
a logical vector of length one; indicates whether to consider
autocorrelated errors ( |
drop |
a numeric vector of length one that indicates the proportion of
the observations to not include at each boundary in the bandwidth estimation
process, if a bandwidth is selected automatically; the default is
|
error_model |
a character vector of length one that indicates whether
for |
nar_lim |
only valid for |
nma_lim |
only valid for |
arma_mean |
only valid for |
Details
Trend and seasonality are estimated based on the additive nonparametric regression model for an equidistant time series
y_t = m(x_t) + s(x_t) + \epsilon_t,
where y_t
is the observed time series with t=1,...n
, x_t = t / n
is the rescaled time
on the interval [0, 1]
, m(x_t)
is a smooth and deterministic
trend function, s(x_t)
is a (slowly changing) seasonal component with
seasonal period p_s
and \epsilon_t
are stationary errors
with E(\epsilon_t) = 0
and short-range dependence (see for example also
Feng, 2013, for a similar model, where the stochastic term is however i.i.d.).
It is assumed that m
and s
can be approximated locally by a polynomial of
small order and by a trigonometric polynomial, respectively. Through locally
weighted regression, m
and s
can therefore be estimated given
a suitable bandwidth.
The iterative-plug-in (IPI) algorithm, which numerically minimizes the asymptotic mean squared error (AMISE) to select a bandwidth is an extension of Feng (2013) to the case with short-range dependence in the errors. To achieve this goal, the error variance in the AMISE in Feng (2013) is replaced by the sum of autocovariances of the error process and this quantity is being estimated using a slightly adjusted version of the Bühlmann (1996) algorithm. This procedure is similar to the method described in Feng, Gries and Fritz (2020), where data-driven local polynomial regression with an automatically selected bandwidth is used to estimate the trend according to a model without seasonality and where the same adjusted Bühlmann (1996) algorithm is considered to estimate the sum of autocovariances in the error process.
Define I[m^{(k)}] = \int_{c_b}^{d_b} [m^{(k)}(x)]^2 dx
, \beta_{(k)} = \int_{-1}^{1} u^k
K(u) du
and R(K) = \int_{-1}^{1} K^{2}(u) du
, where p
is the order of the (local) polynomial
considered for m
,
k = p + 1
is the order of the asymptotically equivalent kernel K
for estimating m
, 0 \leq c_{b}< d_{b} \leq 1
, and
c_f
is the variance factor, i.e. the sum of autocovariances divided by
2\pi
.
Furthermore, we define
C_{1} = \frac{I[m^{(k)}] \beta_{(k)}^2}{(k!)^2}
and
C_{2} = \frac{2 \pi c_{f} (d_b - c_b)[R(K) + (p_s - 1) R(W)]}{nh}
with h
being the bandwidth, n
being the number of
observations and W
being the weighting function considered in the
weighted least squares approach, for example a second-order kernel function
with support on [-1,1]
. The AMISE is then
AMISE(h) = h^{2k}C_{1} + C_{2}.
The function calculates suitable estimates for c_f
, the variance
factor, and I[m^{(k)}]
over different iterations. In each
iteration, a bandwidth is obtained in accordance with the AMISE that once
more serves as an input for the following iteration. The process repeats
until either convergence or the 40th iteration is reached. For further
details on the asymptotic theory or the algorithm, please consult Feng,
Gries and Fritz (2020) or Feng et al. (2019).
To apply the function, only few arguments are needed: a data input y
,
an object with smoothing options smoothing_options
returned by
set_options
and
a starting value for the relative bandwidth
bwidth_start
. Aside from y
, each argument has a default setting.
By default, a local linear trend is considered. In some cases, a local cubic
trend may, however, be more suitable. For more specific information on the input arguments
consult the section Arguments.
When applying the function, an optimal bandwidth is obtained based on the IPI algorithm proposed by Feng, Gries and Fritz (2020). In a second step, the nonparametric trend of the series and the seasonality are calculated with respect to the chosen bandwidth.
Note that with this function m(x_t)
and s(x_t)
can be
estimated without a parametric
model assumption for the error series. Thus, after estimating and removing
the trend and the seasonality, any suitable parametric model, e.g. an
ARMA(p,q
) model for errors = "autocor"
, can be fitted to the
residuals (see arima
).
Usually, a local cubic trend (smoothing_options = set_options(order_poly = 3)
)
gives more suitable results. Moreover, if the resulting bandwidth is too large,
adjustments to the arguments inflation_rate
and drop
should be
tried first in that order before any other changes
to the input arguments.
The default print method for this function delivers key numbers such as the bandwidth considered for smoothing.
NOTE:
This function implements C++ code by means
of the Rcpp
and
RcppArmadillo
packages for
better performance.
Value
The function returns and S4 object with the following elements (access them
via @
):
boundary_method
the applied boundary method.
bwidth
the globally applied bandwidth in the smoothing process; if not if no input is given in the function call, this is the automatically selected optimal bandwidth.
decomp
An object of class
"mts"
that consists of the decomposed time series data.frequency
the frequency of the time series.
kernel_fun
the second-order kernel function considered for weighting.
order_poly
the order of polynomial considered locally for the trend.
order_poly
the order of polynomial considered locally for the trend.
sum_autocov
the estimated sum of autocovariances.
ts_name
the object name of the initially provided time series object.
weights_wfun
a matrix that gives the weights of the weighting function
K
at each estimation time point; ; ifn
is the length of the given time series andb
is the applied (relative) bandwidth, then the first row of the weighting system gives the weighting function weights when estimating att=1
, the second row gives the weights when estimating att=2
and so on for all left-hand side boundary points until the middle row, which contains the weights used at all interior points; the rows following the middle row contain the weights for right-hand side boundary points (the rows are ordered chronologically)weights
an array with many slices that represent the weighting systems for various filters; each slice is a matrix, which gives the weighting system to estimate a component, for example trend + seasonality, as a weighted average from the given time series; if
n
is the length of the given time series andb
is the applied (relative) bandwidth, then the first row of the weighting system gives the weights to obtain estimates att=1
, the second row gives the weights to obtain estimates att=2
and so on for all left-hand side boundary points until the middle row, which contains the weights used at all interior points; the rows following the middle row contain the weights for right-hand side boundary points (the rows are ordered chronologically); the slice names are"Trend"
,"Season"
and"Combined"
, where"Combined"
are the weights to estimate trend + seasonality combined.
Author(s)
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn University),
Author and Package CreatorYuanhua Feng (Department of Economics, Paderborn University),
Author
References
Bühlmann, P. (1996). Locally Adaptive Lag-Window Spectral Estimation. Journal of Time Series Analysis, 17(3): 247-270. DOI: 10.1111/j.1467-9892.1996.tb00275.x.
Feng, Y. (2013). An iterative plug-in algorithm for decomposing seasonal time series using the Berlin Method. Journal of Applied Statistics, 40(2): 266-281. DOI: 10.1080/02664763.2012.740626.
Feng, Y., Gries. T, and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32(2): 510-533. DOI: 10.1080/10485252.2020.1759598.
Examples
Xt <- log(EXPENDITURES)
smoothing_options <- set_options(order_poly = 3)
est <- deseats(Xt, smoothing_options = smoothing_options)
est
plot(est, which = 1)