dis_ppca {mlmts}    R Documentation

Constructs a pairwise distance matrix relying on a piecewise representation based on PCA

Description

dis_ppca returns a pairwise distance matrix based on an extension of the procedure proposed by Wan et al. (2022). The function can also be used for dimensionality reduction (features = TRUE).

Usage

dis_ppca(X, w = 2, var_rate = 0.9, features = F)

Arguments

X

A list of MTS (numerical matrices).

w

The number of segments (in the time dimension) into which each MTS is divided (default is 2).

var_rate

Proportion of variability to retain in the dimensionality-reduced MTS (default is 0.9).

features

Logical. If features = FALSE (default), a distance matrix is returned. Otherwise, the function returns a dataset of feature vectors.

Details

Given a collection of MTS, the function returns the pairwise distance matrix, where the distance between two MTS \boldsymbol X_T and \boldsymbol Y_T is defined as

d_{PPCA}(\boldsymbol X_{T}, \boldsymbol Y_{T}) = \Big\| \operatorname{vec}\big(\widehat{\boldsymbol \Sigma}_a^{\boldsymbol X_T}\big) - \operatorname{vec}\big(\widehat{\boldsymbol \Sigma}_a^{\boldsymbol Y_T}\big) \Big\|,

where \widehat{\boldsymbol \Sigma}_a ^{\boldsymbol X_T} and \widehat{\boldsymbol \Sigma}_a ^{\boldsymbol Y_T} are estimates of the covariance matrices based on a piecewise representation in which the original MTS \boldsymbol X_T and \boldsymbol Y_T, respectively, are divided into w local segments (in the time dimension). If the function is used for dimensionality reduction (features = TRUE), then, for a given series \boldsymbol X_T, the matrix \widehat{\boldsymbol \Sigma}_a ^{\boldsymbol X_T} is decomposed by standard PCA and the number of retained principal components is determined by the parameter var_rate. In that case, dis_ppca returns the reduced counterpart of \boldsymbol X_T, namely the matrix of scores of \boldsymbol X_T with respect to the retained principal components.
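
As a rough illustration of the definitions above, the following base R sketch computes a piecewise covariance estimate, the resulting distance and a reduced series. It is not the implementation used by dis_ppca. The helper names piecewise_cov, ppca_distance and reduce_mts are hypothetical, and the sketch assumes that rows are time points and columns are variables, that the piecewise estimate is the average of the w segment-wise sample covariance matrices, that the norm is Euclidean, and that the series is centred before projection.

```r
# Hypothetical helper (not part of mlmts): averaged covariance over w segments.
# Assumption: rows are time points, columns are variables.
piecewise_cov <- function(x, w = 2) {
  n <- nrow(x)
  stopifnot(is.matrix(x), n >= 2 * w)
  # Assign each time point to one of w contiguous, roughly equal segments
  segment <- ceiling(seq_len(n) * w / n)
  covs <- lapply(split(seq_len(n), segment),
                 function(idx) cov(x[idx, , drop = FALSE]))
  # Assumption: the piecewise estimate is the plain average of the segment covariances
  Reduce(`+`, covs) / w
}

# Euclidean distance between the vectorised covariance estimates
ppca_distance <- function(x, y, w = 2) {
  delta <- as.vector(piecewise_cov(x, w)) - as.vector(piecewise_cov(y, w))
  sqrt(sum(delta^2))
}

# Scores of x on the leading components that reach the var_rate threshold
reduce_mts <- function(x, w = 2, var_rate = 0.9) {
  eig <- eigen(piecewise_cov(x, w), symmetric = TRUE)
  explained <- cumsum(eig$values) / sum(eig$values)
  # Smallest number of components reaching var_rate (all of them as a fallback)
  k <- min(c(which(explained >= var_rate), length(explained)))
  # Assumption: centre each variable before projecting
  centered <- scale(x, center = TRUE, scale = FALSE)
  centered %*% eig$vectors[, seq_len(k), drop = FALSE]
}

# Usage on simulated data: two series with 40 time points and 3 variables
set.seed(1)
x <- matrix(rnorm(40 * 3), nrow = 40, ncol = 3)
y <- matrix(rnorm(40 * 3), nrow = 40, ncol = 3)
d <- ppca_distance(x, y, w = 2)
z <- reduce_mts(x, w = 2, var_rate = 0.9)
```

In this sketch the distance depends only on the averaged covariance matrices, so var_rate plays no role unless a reduced series is requested.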

Value

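If features = FALSE (default), the computed pairwise distance matrix. If features = TRUE, the dimensionality-reduced data described in Details.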

Author(s)

Ángel López-Oriona, José A. Vilar

References

Wan X, Li H, Zhang L, Wu YJ (2022). “Dimensionality reduction for multivariate time-series data mining.” The Journal of Supercomputing, 78(7), 9862–9878.

Examples

# Reducing the dimensionality of the first MTS in dataset RacketSports
reduced_dataset <- dis_ppca(RacketSports$data[1], features = TRUE)
reduced_dataset

# Computing the distance matrix for all MTS in dataset RacketSports
# (by default, features = FALSE)
distance_matrix <- dis_ppca(RacketSports$data)

[Package mlmts version 1.1.1 Index]