R: Dynamic Time Warping

dtw {IncDTW}

R Documentation

Dynamic Time Warping

Description

Calculate the DTW distance, cost matrices and direction matrices including the warping path two multivariate time series.

Usage

dtw(Q, C, dist_method = c("norm1", "norm2", "norm2_square"), 
    step_pattern = c("symmetric2", "symmetric1"), ws = NULL,
    return_cm = FALSE,
    return_diffM = FALSE,
    return_wp = FALSE,
    return_diffp = FALSE,
    return_QC = FALSE)
                     
cm(Q, C, dist_method = c("norm1", "norm2", "norm2_square"), 
   ws = NULL, ...)
   
## S3 method for class 'idtw'
print(x, digits = getOption("digits"), ...)

## S3 method for class 'idtw'
summary(object, ...)

is.idtw(x)

Arguments

`Q`	Query time series. Q needs to be one of the following: (1) a one dimensional vector, (2) a matrix where each row is one observations and each column is one dimension of the time series, or (3) a matrix of differences/ costs (diffM, cm). If Q and C are matrices they need to have the same number of columns.
`C`	Candidate time series. C needs to be one of the following: (1) a one dimensional vector, (2) a matrix where each row is one observations and each column is one dimension of the time series, or (3) if Q is a matrix of differences or costs C needs to be the respective character string 'diffM' or 'cm'.
`dist_method`	character, describes the method of distance measure for multivariate time series (this parameter is ignored for univariate time series). Currently supported methods are 'norm1' (default, is the Manhattan distance), 'norm2' (is the Euclidean distance) and 'norm2_square'. For the function `cm()` the parameter `dist_method` can also be a user defined distance function (see details and examples).
`step_pattern`	character, describes the step pattern. Currently implemented are the patterns `symmetric1` and `symmetric2`, see details.
`ws`	integer, describes the window size for the sakoe chiba window. If NULL, then no window is applied. (default = NULL)
`return_cm`	logical, if TRUE then the Matrix of costs (the absolute value) is returned. (default = FALSE)
`return_diffM`	logical, if TRUE then the Matrix of differences (not the absolute value) is returned. (default = FALSE)
`return_wp`	logical, if TRUE then the warping path is returned. (default = FALSE) If return_diffp == TRUE, then return_wp is set to TRUE as well.
`return_diffp`	logical, if TRUE then the path of differences (not the absolute value) is returned. (default = FALSE)
`return_QC`	logical, if TRUE then the input vectors Q and C are appended to the returned list. This is useful for the `plot.idtw` function. (default = FALSE)
`x`	output from `dtw` or `idtw`.
`object`	any R object
`...`	additional arguments, e.g. passed to `print`, `summary`, or a user defined distance function for `cm()`
`digits`	passed to `round` and `print`

Details

The dynamic time warping distance is the element in the last row and last column of the global cost matrix.

For the multivariate case where Q is a matrix of n rows and k columns and C is a matrix of m rows and k columns the dist_method parameter defines the following distance measures:

norm1:

dist(Q_{i,.}, C_{j,.}) = \sum {l = 1:k} |Q_{i,l} - C_{j,l}|

norm2:

dist(Q_{i,.}, C_{j,.}) = (\sum {l = 1:k} (Q_{i,l} - C_{j,l})^2)^0.5

norm2_square:

dist(Q_{i,.}, C_{j,.}) = \sum{l = 1:k} (Q_{i,l} - C_{j,l})^2

The parameter step_pattern describes how the two time series are aligned. If step_pattern == "symmetric1" then

gcm_{i,j} = cm{i,j} + min(gcm_{i-1,j}, gcm{i-1, j-1}, gcm{i, i-1}

If step_pattern == "symmetric2" then

gcm_{i,j} = cm{i,j} + min(gcm_{i-1,j}, cm{i,j}+ gcm{i-1, j-1}, gcm{i, i-1}

To make DTW distances comparable for many time series of different lengths use the normlized_distance with the setting step_method = 'symmetric2'. Please find a more detailed discussion and further references here: http://dtw.r-forge.r-project.org/.

User defined distance function: To calculate the DTW distance measure of two time series a distance function for the local distance of two observations Q[i, ] and C[j, ] of the time series Q and C has to be selected. The predefined distance function are 'norm1', 'norm2' and 'norm2-square'. It is also possible to define a customized distance function and use the cost matrix cm as input for the DTW algorithm, also for the incremental functions. In the following experiment we apply the cosine distance as local distance function:

d_cos(C_i, Q_j) = 1 - (\sum{o=1:O} Q_{io} * C_{jo})/ ((\sum{o=1:O} Q_{io}^2)^0.5 * (\sum{o=1:O} C_{jo}^2)^0.5).

Value

`distance`	the DTW distance, that is the element of the last row and last column of gcm
`normalized_distance`	the normalized DTW distance, that is the distance divided by `N+M`, where N and M are the lengths of the time series Q and C, respectively. If `step_pattern == 'symmetric1'` no normalization is performed and `NA` is returned (see details).
`gcm`	global cost matrix
`dm`	direction matrix (3=up, 1=diagonal, 2=left)
`wp`	warping path
`ii`	indices of Q of the optimal path
`jj`	indices of C of the optimal path
`cm`	Matrix of costs
`diffM`	Matrix of differences
`diffp`	path of differences
`Q`	input Q
`C`	input C

References

Leodolter, M.; Pland, C.; Brändle, N; IncDTW: An R Package for Incremental Calculation of Dynamic Time Warping. Journal of Statistical Software, 99(9), 1-23. doi: 10.18637/jss.v099.i09
Sakoe, H.; Chiba, S., Dynamic programming algorithm optimization for spoken word recognition, Acoustics, Speech, and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Transactions on , vol.26, no.1, pp. 43-49, Feb 1978. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1163055

Examples


#--- univariate
Q <- cumsum(rnorm(100))
C <- Q[11:100] + rnorm(90, 0, 0.5)
tmp <- dtw(Q = Q, C = C, ws = 15, return_diffM = FALSE, 
           return_QC = TRUE, return_wp = TRUE)
names(tmp)
print(tmp, digits = 3)
plot(tmp)
plot(tmp, type = "warp")


#--- compare different input variations
dtw_base  <- dtw(Q = Q, C = C, ws = 15, return_diffM = TRUE)
dtw_diffM <- dtw(Q = dtw_base$diffM, C = "diffM", ws = 15, 
                 return_diffM = TRUE)
dtw_cm    <- dtw(Q = abs(dtw_base$diffM), C = "cm", ws = 15, 
                 return_diffM = TRUE)

identical(dtw_base$gcm, dtw_cm$gcm)
identical(dtw_base$gcm, dtw_diffM$gcm)

# of course no diffM is returned in the 'cm'-case
dtw_cm$diffM


#--- multivariate case
Q <- matrix(rnorm(100), ncol=2)
C <- matrix(rnorm(80), ncol=2)
dtw(Q = Q, C = C, ws = 30, dist_method = "norm2")


#--- user defined distance function
# We define the distance function d_cos and use it as input for the cost matrix function cm. 
# We can pass the output from cm() to dtw2vec(), and also to idtw2vec() for the 
# incrermental calculation:

d_cos <- function(x, y){ 
  1 - sum(x * y)/(sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

Q <- matrix(rnorm(100), ncol=5, nrow=20)
C <- matrix(rnorm(150), ncol=5, nrow=30)
cm1 <- cm(Q, C, dist_method = d_cos)
dtw2vec(Q = cm1, C = "cm")$distance

res0 <- idtw2vec(Q = cm1[ ,1:20], newObs =  "cm")
idtw2vec(Q = cm1[ ,21:30], newObs =  "cm", gcm_lc = res0$gcm_lc_new)$distance

# The DTW distances -- based on the customized distance function -- of the 
# incremental calculation and the one from scratch are identical.

[Package IncDTW version 1.1.4.4 Index]