sync_cluster {funtimes} | R Documentation |
Time Series Clustering based on Trend Synchronism
Description
Cluster time series with a common parametric trend using the sync_test function
(Lyubchich and Gel 2016; Ghahari et al. 2017).
Usage
sync_cluster(formula, rate = 1, alpha = 0.05, ...)
Arguments
formula |
an object of class " |
rate |
rate of removal of time series. Default is 1 (i.e., if the hypothesis of synchronism is rejected one time series is removed at a time to re-test the remaining time series). Integer values above 1 are treated as the number of time series to be removed. Values from 0 to 1 are treated as a fraction of the time series to be removed. |
alpha |
significance level for testing the hypothesis of a common trend
(using |
... |
arguments to be passed to |
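One possible reading of the rate argument is sketched below. The helper n_to_remove is hypothetical and not part of funtimes. Rounding a fractional rate up and always keeping at least one series are assumptions made for illustration, not documented behavior.

```r
# Hypothetical helper (not part of funtimes): translate `rate` into the
# number of series to drop after a rejected synchronism test.
n_to_remove <- function(rate, n_series) {
  stopifnot(rate > 0, n_series >= 2)
  # Assumption: rate < 1 is a fraction of the current series, rounded up;
  # rate >= 1 is a count of series.
  n <- if (rate < 1) ceiling(rate * n_series) else floor(rate)
  # Assumption: never remove every series at once.
  min(n, n_series - 1)
}

n_to_remove(1, 10)     # default: one series at a time
n_to_remove(3, 10)     # integer above 1: three series
n_to_remove(0.25, 10)  # fraction: ceiling(2.5) = 3 series
```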
Details
The sync_cluster function recursively clusters time series having
a pre-specified common parametric trend until there are no time series left.
Starting with the given N time series, the sync_test function
is used to test for a common trend. If the null hypothesis of a common trend is not
rejected by sync_test, the time series are grouped
(i.e., assigned to a cluster). Otherwise, the time series with the largest
contribution to the test statistic are temporarily removed (the number of time
series to remove depends on the rate of removal), and sync_test
is applied again. The contribution to the test statistic is assessed by the
WAVK test statistic calculated for each time series.
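The procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package implementation. sketch_cluster and toy_test are hypothetical names. toy_test is a crude slope-based stand-in for sync_test that returns a p-value of 0 or 1, and rate is assumed to be an integer count of at least 1.

```r
# Hypothetical stand-in for sync_test (NOT the real test): series are called
# "synchronous" when all OLS slopes lie within `tol` of the median slope.
toy_test <- function(X, tol = 0.5) {
  tt <- seq_len(nrow(X)) / nrow(X)
  slopes <- apply(X, 2, function(x) cov(x, tt) / var(tt))
  contrib <- abs(slopes - median(slopes))  # per-series contribution
  list(p.value = if (max(contrib) < tol) 1 else 0, contrib = contrib)
}

# Hypothetical sketch of the recursive clustering loop.
sketch_cluster <- function(X, test_fun, rate = 1, alpha = 0.05) {
  labels <- integer(ncol(X))      # 0 marks single-element clusters
  remaining <- seq_len(ncol(X))   # series not yet assigned
  k <- 0L                         # current cluster label
  while (length(remaining) > 1) {
    candidate <- remaining
    while (length(candidate) >= 2) {
      res <- test_fun(X[, candidate, drop = FALSE])
      if (res$p.value >= alpha) break   # common trend not rejected
      # Temporarily remove the series contributing most to the rejection.
      n_drop <- min(rate, length(candidate) - 1)
      worst <- order(res$contrib, decreasing = TRUE)[seq_len(n_drop)]
      candidate <- candidate[-worst]
    }
    if (length(candidate) >= 2) {       # a cluster was found
      k <- k + 1L
      labels[candidate] <- k
    }
    remaining <- setdiff(remaining, candidate)
  }
  labels
}

# Noise-free toy data: series 1 and 4 have slope 5, the others slope 0.
tt <- (1:50) / 50
X <- cbind(5 * tt, 0 * tt + 1, 0 * tt + 2, 5 * tt - 1, 0 * tt)
labels <- sketch_cluster(X, toy_test)
print(labels)  # 2 1 1 2 1
```

In this toy run the flat series are clustered first because the two trending series are removed one at a time; the removed series are then re-tested among themselves and form the second cluster.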
Value
A list with the elements:
cluster |
an integer vector indicating the cluster to which each time series is
allocated. A label |
elements |
a list with names of the time series in each cluster. |
The further elements combine results of sync_test for each cluster with
at least two elements (that is, single-element clusters labeled with '0'
are excluded):
estimate |
a list with common parametric trend estimates obtained by
|
pval |
a list of |
statistic |
a list with values of |
ar_order |
a list of AR filter orders used in |
window_used |
a list of local windows used in |
all_considered_windows |
a list of all windows considered in
|
WAVK_obs |
a list of WAVK test statistics obtained in |
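To illustrate how the cluster labels relate to a per-cluster grouping of series names, the base R sketch below starts from a mock label vector copied from the second sample output in Examples. The series names are hypothetical placeholders, and the grouping is only an approximation of what the elements component describes; no funtimes function is called.

```r
# Mock cluster labels, copied from the second sample output in Examples.
cluster <- c(1, 1, 1, 0, 2, 0, 2)
# Hypothetical series names, used here for illustration only.
series <- paste("Series", seq_along(cluster), sep = "_")

# Group series names by cluster label; "0" collects single-element clusters.
groups <- split(series, cluster)
print(groups)

# Number of clusters with at least two elements.
n_clusters <- length(setdiff(unique(cluster), 0))
print(n_clusters)
```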
Author(s)
Srishti Vishwakarma, Vyacheslav Lyubchich
References
Ghahari A, Gel YR, Lyubchich V, Chun Y, Uribe D (2017).
“On employing multi-resolution weather data in crop insurance.”
In Proceedings of the SIAM International Conference on Data Mining (SDM17) Workshop on Mining Big Data in Climate and Environment (MBDCE 2017).
Lyubchich V, Gel YR (2016).
“A local factor nonparametric test for trend synchronism in multiple time series.”
Journal of Multivariate Analysis, 150, 91–104.
doi:10.1016/j.jmva.2016.05.004.
See Also
Examples
## Not run:
## Simulate 4 autoregressive time series,
## 1 having a linear trend and 3 without a trend:
set.seed(123)
T = 100 #length of time series
N = 4 #number of time series
X = sapply(1:N, function(x) arima.sim(n = T,
           list(order = c(1, 0, 0), ar = c(0.6))))
X[,1] <- 5 * (1:T)/T + X[,1]
plot.ts(X)
# Finding clusters with common linear trends:
LinTrend <- sync_cluster(X ~ t)
## Sample Output:
##[1] "Cluster labels:"
##[1] 0 1 1 1
##[1] "Number of single-element clusters (labeled with '0'): 1"
## Plot the time series in each cluster obtained
for(i in 1:max(LinTrend$cluster)) {
    plot.ts(X[, LinTrend$cluster == i],
            main = paste("Cluster", i))
}
## Simulating 7 autoregressive time series,
## where first 4 time series have a linear trend added
set.seed(234)
T = 100 #length of time series
a <- sapply(1:4, function(x) -10 + 0.1 * (1:T) +
            arima.sim(n = T, list(order = c(1, 0, 0), ar = c(0.6))))
b <- sapply(1:3, function(x) arima.sim(n = T,
            list(order = c(1, 0, 0), ar = c(0.6))))
Y <- cbind(a, b)
plot.ts(Y)
## Clustering based on linear trend with rate of removal = 2
## and significance level 0.1 for the synchronism test
LinTrend7 <- sync_cluster(Y ~ t, rate = 2, alpha = 0.1, B = 99)
## Sample output:
##[1] "Cluster labels:"
##[1] 1 1 1 0 2 0 2
##[1] "Number of single-element clusters (labeled with '0'): 2"
## End(Not run)