dis_ppca {mlmts} | R Documentation |
Constructs a pairwise distance matrix relying on a piecewise representation based on PCA
Description
dis_ppca returns a pairwise distance matrix based on an extension of the procedure proposed by Wan et al. (2022). The function can also be used for dimensionality reduction purposes.
Usage
dis_ppca(X, w = 2, var_rate = 0.9, features = F)
Arguments
X |
A list of MTS (numerical matrices). |
w |
The number of segments (in the time dimension) into which each MTS is divided (default is 2). |
var_rate |
Proportion of variability retained in the dimensionality-reduced MTS samples (default is 0.9). |
features |
Logical. If |
Details
Given a collection of MTS, the function returns the pairwise distance matrix, where the distance between two MTS \boldsymbol X_T and \boldsymbol Y_T is defined as
d_{PPCA}(\boldsymbol X_T, \boldsymbol Y_T) = \Big\| vec\big(\widehat{\boldsymbol \Sigma}_a^{\boldsymbol X_T}\big) - vec\big(\widehat{\boldsymbol \Sigma}_a^{\boldsymbol Y_T}\big) \Big\|,
where \widehat{\boldsymbol \Sigma}_a^{\boldsymbol X_T} and \widehat{\boldsymbol \Sigma}_a^{\boldsymbol Y_T} are estimates of the covariance matrices based on a piecewise representation in which the original MTS \boldsymbol X_T and \boldsymbol Y_T, respectively, are divided into w local segments (in the time dimension).
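To make the distance concrete, here is a minimal sketch in base R. It rests on two assumptions that this page does not state: that the piecewise estimate is the plain average of the sample covariance matrices of w contiguous, roughly equal-length segments, and that the norm is Euclidean. The helpers piecewise_cov and d_ppca_sketch are hypothetical names, not part of mlmts, and the sketch should not be read as the package's implementation.

```r
# Hypothetical helper (not part of mlmts).
# ASSUMPTION: the piecewise estimate is the average of per-segment covariances.
piecewise_cov <- function(X, w = 2) {
  n <- nrow(X)
  # Assign each time point (row) to one of w contiguous segments
  seg <- ceiling(seq_len(n) * w / n)
  covs <- lapply(split(seq_len(n), seg),
                 function(idx) cov(X[idx, , drop = FALSE]))
  # Average the per-segment covariance matrices
  Reduce(`+`, covs) / length(covs)
}

# Hypothetical helper: norm of the difference of the vectorised estimates.
# ASSUMPTION: the norm is Euclidean.
d_ppca_sketch <- function(X, Y, w = 2) {
  diff_vec <- as.vector(piecewise_cov(X, w)) - as.vector(piecewise_cov(Y, w))
  sqrt(sum(diff_vec^2))
}

# Two simulated MTS with 50 time points and 4 variables each
set.seed(1)
X <- matrix(rnorm(200), nrow = 50, ncol = 4)
Y <- matrix(rnorm(200), nrow = 50, ncol = 4)
d_ppca_sketch(X, Y, w = 2)
```

Under these assumptions the distance is the Frobenius norm of the difference between the two averaged covariance matrices, so it only requires both series to have the same number of variables, not the same length.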
If the function is used for dimensionality reduction (features = TRUE), then for a given series \boldsymbol X_T, the matrix \widehat{\boldsymbol \Sigma}_a^{\boldsymbol X_T} is decomposed by standard PCA and a number of principal components are retained according to the parameter var_rate. In that case, dis_ppca returns the reduced counterpart of \boldsymbol X_T, which is constructed from \boldsymbol X_T by taking the matrix of scores with respect to the retained principal components.
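The reduction step can be sketched in base R as follows. The function reduce_mts_sketch is a hypothetical name, not part of mlmts. The sketch assumes that the number of retained components is the smallest one whose cumulative share of variance reaches var_rate, and that the series is column-centred before projection. Neither detail is confirmed by this page. For self-containment, the ordinary sample covariance cov(X) stands in for the piecewise covariance estimate.

```r
# Hypothetical helper (not part of mlmts): project X onto the leading
# eigenvectors of a covariance estimate S.
reduce_mts_sketch <- function(X, S, var_rate = 0.9) {
  eig <- eigen(S, symmetric = TRUE)
  # Cumulative proportion of variance explained by the leading components
  explained <- cumsum(eig$values) / sum(eig$values)
  # ASSUMPTION: keep the smallest number of components reaching var_rate
  # (small tolerance guards against floating-point rounding)
  k <- which(explained >= var_rate - 1e-12)[1]
  V <- eig$vectors[, seq_len(k), drop = FALSE]
  # ASSUMPTION: columns are centred before computing the scores
  scale(X, center = TRUE, scale = FALSE) %*% V
}

# Simulated MTS with 60 time points and 5 variables
set.seed(1)
X <- matrix(rnorm(300), nrow = 60, ncol = 5)
S <- cov(X)  # stand-in for the piecewise covariance estimate
Z <- reduce_mts_sketch(X, S, var_rate = 0.9)
dim(Z)       # same number of rows as X, at most 5 columns
```

The result keeps one row per time point, so the reduced series has the same length as the original and fewer (or equally many) variables.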
Value
The computed pairwise distance matrix. If features = TRUE, the reduced counterparts of the input MTS are returned instead (see Details).
Author(s)
Ángel López-Oriona, José A. Vilar
References
Wan X, Li H, Zhang L, Wu YJ (2022). “Dimensionality reduction for multivariate time-series data mining.” The Journal of Supercomputing, 78(7), 9862–9878.
Examples
library(mlmts)
# Reduce the dimensionality of the first MTS in dataset RacketSports
reduced_dataset <- dis_ppca(RacketSports$data[1], features = TRUE)
reduced_dataset
# Compute the distance matrix for all MTS in dataset RacketSports
# (by default, features = FALSE)
distance_matrix <- dis_ppca(RacketSports$data)