dis_mcc {mlmts}    R Documentation

Constructs a pairwise distance matrix based on maximal cross-correlations

Description

dis_mcc returns a pairwise distance matrix based on an extension of the procedure proposed by Egri et al. (2017). The function can also be used for dimensionality reduction.

Usage

dis_mcc(X, max_lag = 20, delta = 0.7, features = F)

Arguments

X

A list of MTS (numerical matrices).

max_lag

The maximum number of lags for the computation of the cross-correlations (default is 20).

delta

The threshold applied to the maximal cross-correlations when features = TRUE: entries above delta are kept and all remaining entries are set to zero (default is 0.7).

features

Logical. If features = FALSE (default), a distance matrix is returned. Otherwise, the function returns a dataset of feature vectors.
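
As a rough illustration of how max_lag and delta interact, the sketch below computes the maximal absolute cross-correlation between two simulated univariate series with base R's stats::ccf and compares it with the threshold. It is not taken from the package source: the use of ccf, the symmetric lag window, and the simulated series x and y are all assumptions made for illustration.

```r
set.seed(1)
n <- 200

# Hypothetical data: x is an AR(1) series, y is x delayed by 3 steps plus noise
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- c(rnorm(3), x[1:(n - 3)]) + rnorm(n, sd = 0.3)

# Sample cross-correlations for lags -max_lag, ..., max_lag (assumed window)
max_lag <- 20
cc <- stats::ccf(x, y, lag.max = max_lag, plot = FALSE)

# Maximal cross-correlation in absolute value, and the lag where it occurs
max_abs_cc <- max(abs(cc$acf))
best_lag <- cc$lag[which.max(abs(cc$acf))]

# Comparison with the threshold delta
delta <- 0.7
exceeds_delta <- max_abs_cc > delta
```

A larger max_lag lets the search find dependence at longer delays, at the cost of evaluating more lags per pair of variables.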

Details

Given a collection of MTS, the function returns the pairwise distance matrix, where the distance between two MTS \boldsymbol X_T and \boldsymbol Y_T is defined as

d_{MCC}(\boldsymbol X_{T}, \boldsymbol Y_{T})=\Big|\Big|vec\big(\widehat{\boldsymbol \Theta}^{\boldsymbol X_T}\big) -vec\big(\widehat{\boldsymbol \Theta}^{\boldsymbol Y_T}\big)\Big|\Big|,

where \widehat{\boldsymbol \Theta}^{\boldsymbol X_T} and \widehat{\boldsymbol \Theta}^{\boldsymbol Y_T} are matrices containing the estimated pairwise maximal cross-correlations (in absolute value) for series \boldsymbol X_T and \boldsymbol Y_T, respectively, and the operator vec(\cdot) creates a vector by concatenating the columns of the matrix received as input.

If the function is used for dimensionality reduction (features = TRUE), then for a given series \boldsymbol X_T, a new matrix \widehat{\boldsymbol \Theta}^{\boldsymbol X_T}_\delta is constructed by keeping the entries of \widehat{\boldsymbol \Theta}^{\boldsymbol X_T} that are above \delta and setting all remaining entries to zero. The connected components of the graph defined by \widehat{\boldsymbol \Theta}^{\boldsymbol X_T}_\delta are then computed, along with their corresponding centers (variables). In this case, dis_mcc returns the reduced counterpart of \boldsymbol X_T, which is constructed from \boldsymbol X_T by removing all variables that were not selected as centers of their components.
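
The definitions above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation. It assumes that each MTS is a matrix with time points in rows and variables in columns, that the norm is Euclidean, and that the diagonal of the matrix is set to 1. The helper names max_cc_matrix, mcc_distance and threshold_components are hypothetical. Since the rule for choosing the center of each component is not stated here, the sketch stops at the component labels.

```r
# Matrix of maximal absolute cross-correlations between the columns of one MTS
max_cc_matrix <- function(X, max_lag = 20) {
  d <- ncol(X)
  theta <- diag(1, d)  # assumption: diagonal entries set to 1
  for (i in seq_len(d - 1)) {
    for (j in (i + 1):d) {
      cc <- stats::ccf(X[, i], X[, j], lag.max = max_lag, plot = FALSE)
      theta[i, j] <- theta[j, i] <- max(abs(cc$acf))
    }
  }
  theta
}

# Distance between two MTS: norm of the difference of the vectorized matrices
# (as.vector stacks the columns, which matches vec; Euclidean norm is assumed)
mcc_distance <- function(X, Y, max_lag = 20) {
  diff <- as.vector(max_cc_matrix(X, max_lag)) -
    as.vector(max_cc_matrix(Y, max_lag))
  sqrt(sum(diff^2))
}

# Connected components of the graph whose edges are the entries above delta
threshold_components <- function(theta, delta = 0.7) {
  adj <- theta > delta
  d <- nrow(adj)
  label <- integer(d)  # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(d)) {
    if (label[start] != 0L) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[v, ] & label == 0L)  # unvisited neighbours of v
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

# Hypothetical data: in X the first two variables are nearly identical
set.seed(42)
n <- 300
z <- rnorm(n)
X <- cbind(z, z + rnorm(n, sd = 0.1), rnorm(n))
Y <- matrix(rnorm(n * 3), ncol = 3)

theta_X <- max_cc_matrix(X)
d_XY <- mcc_distance(X, Y)
comp_X <- threshold_components(theta_X)
```

In this toy example the first two variables of X fall in the same component because their maximal cross-correlation exceeds delta, so a reduction step would keep only one of them together with the third variable.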

Value

If features = FALSE (default), the computed pairwise distance matrix. Otherwise, a dataset of feature vectors.

Author(s)

Ángel López-Oriona, José A. Vilar

References

Egri A, Horváth I, Kovács F, Molontay R, Varga K (2017). “Cross-correlation based clustering and dimension reduction of multivariate time series.” In 2017 IEEE 21st International Conference on Intelligent Engineering Systems (INES), 000241–000246. IEEE.

Examples

# Reduce the dimensionality of the first MTS in dataset RacketSports
reduced_dataset <- dis_mcc(RacketSports$data[1], features = TRUE)
reduced_dataset

# Compute the distance matrix for all MTS in dataset Libras
# (by default, features = FALSE)
distance_matrix <- dis_mcc(Libras$data)

[Package mlmts version 1.1.1 Index]