dis_gcc {mlmts} | R Documentation |
Constructs a pairwise distance matrix based on the generalized cross-correlation
Description
dis_gcc returns a pairwise distance matrix based on the generalized cross-correlation measure introduced by Alonso and Pena (2019).
Usage
dis_gcc(X, lag_max = 1, features = FALSE)
Arguments
X | A list of MTS (numerical matrices). |
lag_max | The maximum lag considered to compute the generalized cross-correlation. |
features | Logical. If |
Details
Given a collection of MTS, the function returns the pairwise distance matrix, where the distance between two MTS \boldsymbol X_T and \boldsymbol Y_T is defined as

d_{GCC}(\boldsymbol X_T, \boldsymbol Y_T)=\Bigg[\sum_{j_1,j_2=1, j_1 \ne j_2}^{d} \bigg(\widehat{GCC}(\boldsymbol X_{T,j_1}, \boldsymbol X_{T,j_2})-\widehat{GCC}(\boldsymbol Y_{T,j_1}, \boldsymbol Y_{T,j_2})\bigg)^2\Bigg]^{1/2},

where d is the number of dimensions, \boldsymbol X_{T,j} and \boldsymbol Y_{T,j} are the jth dimensions (univariate time series) of \boldsymbol X_T and \boldsymbol Y_T, respectively, and \widehat{GCC}(\cdot, \cdot) is the estimated generalized cross-correlation measure between univariate series proposed by Alonso and Pena (2019).
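The distance above can be sketched in base R. This is only an illustration, not the code used inside dis_gcc. It assumes that each MTS is a matrix with time points in rows and dimensions in columns, and that the GCC estimate takes the determinant-ratio form 1 - |R_joint|^{1/(k+1)} / (|R_xx|^{1/(k+1)} |R_yy|^{1/(k+1)}) built from correlation matrices of lagged values, which is one reading of Alonso and Pena (2019). The names gcc_sketch and d_gcc_sketch are hypothetical.

```r
# Hypothetical helper: estimated GCC between two univariate series x and y,
# using lags 0..k. ASSUMPTION: determinant-ratio form of the measure.
gcc_sketch <- function(x, y, k = 1) {
  Xl <- embed(x, k + 1)            # columns: x_t, x_{t-1}, ..., x_{t-k}
  Yl <- embed(y, k + 1)            # columns: y_t, y_{t-1}, ..., y_{t-k}
  det_xx <- max(det(cor(Xl)), 0)   # guard against tiny negative round-off
  det_yy <- max(det(cor(Yl)), 0)
  det_joint <- max(det(cor(cbind(Xl, Yl))), 0)
  p <- 1 / (k + 1)
  1 - det_joint^p / (det_xx^p * det_yy^p)
}

# Hypothetical helper: d_GCC between two MTS with the same number of columns.
# ASSUMPTION: rows are time points, columns are dimensions.
d_gcc_sketch <- function(X, Y, k = 1) {
  d <- ncol(X)
  total <- 0
  for (j1 in seq_len(d)) {
    for (j2 in seq_len(d)) {
      if (j1 != j2) {
        diff <- gcc_sketch(X[, j1], X[, j2], k) - gcc_sketch(Y[, j1], Y[, j2], k)
        total <- total + diff^2
      }
    }
  }
  sqrt(total)
}

# Toy data: two simulated MTS of length 50 with 3 dimensions each
set.seed(1)
X <- matrix(rnorm(150), nrow = 50, ncol = 3)
Y <- matrix(rnorm(150), nrow = 50, ncol = 3)

d_xy <- d_gcc_sketch(X, Y, k = 1)
d_xx <- d_gcc_sketch(X, X, k = 1)  # an MTS is at distance 0 from itself
```

Here k plays the role of lag_max. Because the sum runs over all ordered pairs with j_1 different from j_2, each unordered pair of dimensions is visited twice in this sketch.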
Value
If features = FALSE (default), returns a distance matrix based on the distance d_{GCC}. Otherwise, the function returns a dataset of feature vectors, i.e., each row in the dataset contains the features employed to compute the distance d_{GCC}.
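The relation between the two return types can be illustrated with a small hand-made feature dataset. The sketch assumes, as the formula for d_{GCC} suggests, that the distance between two MTS is the Euclidean distance between their rows of features. The numbers and row names below are hypothetical and are not output of dis_gcc.

```r
# Hypothetical feature dataset: one row per MTS, one column per pair of dimensions
features <- rbind(
  mts_1 = c(0.1, 0.2, 0.3),
  mts_2 = c(0.1, 0.2, 0.3),
  mts_3 = c(0.4, 0.6, 0.3)
)

# Euclidean distances between feature rows, arranged as a square matrix
distance_matrix <- as.matrix(dist(features, method = "euclidean"))
```

Rows with identical features, such as mts_1 and mts_2 here, end up at distance zero.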
Author(s)
Ángel López-Oriona, José A. Vilar
References
Alonso AM, Pena D (2019). “Clustering time series by linear dependency.” Statistics and Computing, 29(4), 655–676.
Examples
library(mlmts)
# Select the first 10 MTS from the AtrialFibrillation dataset
toy_dataset <- AtrialFibrillation$data[1:10]
# Compute the pairwise distance matrix based on the distance d_GCC
distance_matrix <- dis_gcc(toy_dataset)
# Compute the corresponding dataset of features
feature_dataset <- dis_gcc(toy_dataset, features = TRUE)