diss.CID {TSclust}    R Documentation

Complexity-Invariant Distance Measure For Time Series

Description

Computes the Euclidean distance between two time series, corrected by a factor based on the complexity estimate of each series.

Usage

diss.CID(x, y)

Arguments

x

Numeric vector containing the first of the two time series.

y

Numeric vector containing the second of the two time series.

Details

This distance is defined as:

CID(x,y) = ED(x,y) \times CF(x,y)

where CF(x,y) is a complexity correction factor defined as:

max(CE(x), CE(y)) / min(CE(x), CE(y))

and CE(x) is a complexity estimate of a time series x. diss.CID therefore increases the distance between series with different complexities. If the series have the same complexity estimate, the distance degenerates to the Euclidean distance. The complexity is defined in diss.CID as:

CE(x) = \sqrt{ \sum_{t=1}^{n-1} (x_{t+1} - x_t)^2 }

where n is the length of the series x.
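
The definitions above can be illustrated with a minimal base R sketch. The helper names ce and cid_sketch are hypothetical and are not part of TSclust; the sketch assumes both series have the same length and that neither is constant (a constant series has a complexity estimate of 0, which makes the correction factor undefined). It is meant only to show how the formulas fit together, not how diss.CID is implemented internally.

```r
# Complexity estimate CE(x): square root of the summed squared first differences.
ce <- function(x) sqrt(sum(diff(x)^2))

# Hypothetical helper (not a TSclust function) mirroring the formulas above.
cid_sketch <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  ed <- sqrt(sum((x - y)^2))                # Euclidean distance ED(x, y)
  ce_x <- ce(x)
  ce_y <- ce(y)
  cf <- max(ce_x, ce_y) / min(ce_x, ce_y)   # complexity correction factor CF(x, y)
  ed * cf
}

x <- c(0, 1, 0, 1, 0)   # oscillating series, CE(x) = 2
y <- c(0, 0, 0, 0, 1)   # nearly flat series, CE(y) = 1
cid_sketch(x, y)        # ED = sqrt(3), CF = 2, so the result is 2 * sqrt(3)
```

Because CF(x,y) is never smaller than 1, the corrected distance is never smaller than the plain Euclidean distance, and the two coincide when both series have the same complexity estimate.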

Value

The computed dissimilarity.

Author(s)

Pablo Montero Manso, José Antonio Vilar.

References

Batista, G. E., Wang, X., & Keogh, E. J. (2011). A Complexity-Invariant Distance Measure for Time Series. In SDM (Vol. 31, p. 32).

Montero, P and Vilar, J.A. (2014) TSclust: An R Package for Time Series Clustering. Journal of Statistical Software, 62(1), 1-43. http://www.jstatsoft.org/v62/i01/.

See Also

diss, diss.CORT

Examples

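n <- 100
# Generate sample series: white noise and a random walk (discrete Wiener process)
x <- rnorm(n)
y <- cumsum(rnorm(n))

# CID dissimilarity between the two series
diss.CID(x, y)

# Pairwise CID dissimilarities for a set of series, one series per row
z <- rnorm(n)
w <- cumsum(rnorm(n))
series <- rbind(x, y, z, w)
diss(series, "CID")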



[Package TSclust version 1.3.1 Index]