LCSSDistance {TSdist} | R Documentation |
Longest Common Subsequence distance for Real Sequences.
Description
Computes the Longest Common Subsequence distance between a pair of numeric time series.
Usage
LCSSDistance(x, y, epsilon, sigma)
Arguments
x |
Numeric vector containing the first time series. |
y |
Numeric vector containing the second time series. |
epsilon |
A positive threshold value that defines the distance. |
sigma |
If desired, a Sakoe-Chiba windowing contraint can be added by specifying a positive integer representing the window size. |
Details
The Longest Common Subsequence for two real sequences is computed.
For this purpose,
the distances between the points of x
and y
are reduced to 0 or 1. If the Euclidean distance between two points x_i
and y_j
is smaller than epsilon
they are considered
equal and their distance is reduced to 0. In the opposite case,
the distance between them is represented with a value of 1.
Once the distance matrix is defined in this manner, the maximum common subsequence is seeked. Of course, as in other Edit Based Distances, gaps or unmatched regions are permitted and they are penalized with a value proportional to their length.
Based on its definition, the length of series x
and y
may be
different.
If desired, a temporal constraint may be added to the LCSS
distance. In this package, only the most basic windowing function, introduced
by H.Sakoe and S.Chiba (1978), is implemented. This function sets a band around the
main diagonal of the distance matrix and avoids the matching of the points that are farther in time than a specified \sigma
.
The size of the window must be a positive integer value. Furthermore, the following condition must be fulfilled:
|length(x)-length(y)| < sigma
Value
d |
The computed distance between the pair of series. |
Author(s)
Usue Mori, Alexander Mendiburu, Jose A. Lozano.
References
Vlachos, M., Kollios, G., & Gunopulos, D. (2002). Discovering similar multidimensional trajectories. In Proceedings 18th International Conference on Data Engineering (pp. 673-684). IEEE Comput. Soc. doi:10.1109/ICDE.2002.994784
Chen, L., & Ng, R. (2004). On The Marriage of Lp-norms and Edit Distance. In Proceedings of the Thirtieth International Conference on Very Large Data Bases (pp. 792–803).
Cuturi, M. (2011). Fast Global Alignment Kernels. In Proceedings of the 28th International Conference on Machine Learning (pp. 929–936).
Gaidon, A., Harchaoui, Z., & Schmid, C. (2011). A time series kernel for action recognition. In BMVC 2011 - British Machine Vision Conference (pp. 63.1–63.11).
Marteau, P.-F., & Gibet, S. (2014). On Recursive Edit Distance Kernels With Applications To Time Series Classification. IEEE Transactions on Neural Networks and Learning Systems, PP(6), 1–13.
Lei, H., & Sun, B. (2007). A Study on the Dynamic Time Warping in Kernel Machines. In 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System (pp. 839–845).
Pree, H., Herwig, B., Gruber, T., Sick, B., David, K., & Lukowicz, P. (2014). On general purpose time series similarity measures and their use as kernel functions in support vector machines. Information Sciences, 281, 478–495.
See Also
To calculate this distance measure using ts
, zoo
or xts
objects see TSDistances
. To calculate distance matrices of time series databases using this measure see TSDatabaseDistances
.
Examples
# The objects example.series3 and example.series4 are two
# numeric series of length 100 and 120 contained in the TSdist
# package.
data(example.series3)
data(example.series4)
# For information on their generation and shape see
# help page of example.series.
help(example.series)
# Calculate the LCSS distance for two series of different length
# with no windowing constraint:
LCSSDistance(example.series3, example.series4, epsilon=0.1)
# Calculate the LCSS distance for two series of different length
# with a window of size 30:
LCSSDistance(example.series3, example.series4, epsilon=0.1, sigma=30)