kma.similarity {briKmeans}

R Documentation

Similarity/dissimilarity index between two functions

Description

kma.similarity computes a similarity/dissimilarity measure between two functions f and g. Users can choose among different types of measures.

Usage

kma.similarity(x.f = NULL, y0.f = NULL, y1.f = NULL,
x.g = NULL, y0.g = NULL, y1.g = NULL, similarity.method, unif.grid = TRUE)

Arguments

`x.f`	vector of length length.f: abscissa grid where function `f` and its first derivative `f'` are evaluated. length.f: number of abscissa values where `f` is evaluated. `x.f` must always be provided.
`y0.f`	vector of length length.f or matrix of size length.f x d: evaluations of function `f` on the abscissa grid `x.f`. length.f: number of abscissa values where `f` is evaluated. d (only if `f` and `g` are multidimensional): number of components of the function, i.e. `f` is a `d`-dimensional curve. Default value of `y0.f` is `NULL`. `y0.f` must be provided if the chosen `similarity.method` concerns the original functions (the `d0.*` methods).
`y1.f`	vector of length length.f or matrix of size length.f x d: evaluations of the first derivative of `f`, i.e., `f'`, on the abscissa grid `x.f`. Default value of `y1.f` is `NULL`. `y1.f` must be provided if the chosen `similarity.method` concerns the first derivatives (the `d1.*` methods).
`x.g`	vector of length length.g: abscissa grid where function `g` and its first derivative `g'` are evaluated. length.g: number of abscissa values where `g` is evaluated. `x.g` must always be provided.
`y0.g`	vector of length length.g or matrix of size length.g x d: evaluations of function `g` on the abscissa grid `x.g`. length.g: number of abscissa values where `g` is evaluated. d (only if `f` and `g` are multidimensional): number of components of the function, i.e. `g` is a `d`-dimensional curve. Default value of `y0.g` is `NULL`. `y0.g` must be provided if the chosen `similarity.method` concerns the original functions (the `d0.*` methods).
`y1.g`	vector of length length.g or matrix of size length.g x d: evaluations of the first derivative of `g`, i.e., `g'`, on the abscissa grid `x.g`. Default value of `y1.g` is `NULL`. `y1.g` must be provided if the chosen `similarity.method` concerns the first derivatives (the `d1.*` methods).
`similarity.method`	character: similarity/dissimilarity between `f` and `g`. Possible choices are: `'d0.pearson'`, `'d1.pearson'`, `'d0.L2'`, `'d1.L2'`, `'d0.L2.centered'`, `'d1.L2.centered'`. Default value is `'d1.pearson'`. See details.
`unif.grid`	boolean: if `TRUE`, the similarity measure is computed over a uniform grid built on the intersection of the domains of the two functions, i.e. an additional discretization is performed. If `FALSE`, the additional discretization is not performed: the functions are assumed to be already defined on the same abscissa grid, and that grid is assumed to be fine enough to compute the similarity accurately.
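
To illustrate what an additional discretization on the intersection domain can look like, here is a minimal base-R sketch. It is not the package's implementation: the helper name `resample_common`, the grid size `n`, and the use of linear interpolation via `approx` are all assumptions made for illustration.

```r
# Hypothetical helper (not part of briKmeans): resample f and g onto a
# shared uniform grid covering the intersection of their domains.
resample_common <- function(x.f, y.f, x.g, y.g, n = 101) {
  lo <- max(min(x.f), min(x.g))   # left end of the intersection D
  hi <- min(max(x.f), max(x.g))   # right end of the intersection D
  if (lo >= hi) stop("the domains of f and g do not overlap")
  x <- seq(lo, hi, length.out = n)
  list(x = x,
       f = approx(x.f, y.f, xout = x)$y,   # linear interpolation of f
       g = approx(x.g, y.g, xout = x)$y)   # linear interpolation of g
}

# f is sampled on [0, 2], g on [1, 3]; the common domain is [1, 2].
x.f <- seq(0, 2, by = 0.1)
x.g <- seq(1, 3, by = 0.25)
res <- resample_common(x.f, sin(x.f), x.g, cos(x.g), n = 11)
range(res$x)
```

With `unif.grid = FALSE` no such step takes place, so both functions would already need to share one sufficiently fine grid.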

Details

We report the list of the currently available similarities/dissimilarities. Note that all norms and inner products are computed over D, that is the intersection of the domains of f and g. \overline{f} and \overline{g} denote the mean value, respectively, of functions f and g.

1. 'd0.pearson': this similarity measure is the cosine of the angle between the two functions f and g.

\frac{\langle f,g \rangle_{L^2}}{\|{f}\|_{L^2} \|{g}\|_{L^2}}

2. 'd1.pearson': this similarity measure is the cosine of the angle between the two function derivatives f' and g'.

\frac{\langle f',g' \rangle_{L^2}}{\|{f'}\|_{L^2} \|{g'}\|_{L^2}}

3. 'd0.L2': this dissimilarity measure is the L2 distance of the two functions f and g normalized by the length of the common domain D.

\frac{\|{f-g}\|_{L^2}}{|D|}

4. 'd1.L2': this dissimilarity measure is the L2 distance of the two function first derivatives f' and g' normalized by the length of the common domain D.

\frac{\|{f'-g'}\|_{L^2}}{|D|}

5. 'd0.L2.centered': this dissimilarity measure is the L2 distance of f-\overline{f} and g-\overline{g} normalized by the length of the common domain D.

\frac{\|{(f-\overline{f})-(g-\overline{g})}\|_{L^2}}{|D|}

6. 'd1.L2.centered': this dissimilarity measure is the L2 distance of f'-\overline{f'} and g'-\overline{g'} normalized by the length of the common domain D.

\frac{\|{(f'-\overline{f'})-(g'-\overline{g'})}\|_{L^2}}{|D|}
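
The six formulas above can be sketched numerically in base R. This is only an illustration under stated assumptions, not the code used by `kma.similarity`: the helpers `trapz` and `similarity_sketch` are hypothetical, the functions are one-dimensional and already sampled on a common grid `x` covering D, and integrals are approximated with the trapezoidal rule.

```r
# Trapezoidal approximation of the integral of y over the grid x.
trapz <- function(x, y) sum(diff(x) * (y[-1] + y[-length(y)]) / 2)

# Hypothetical helper, not the package code: evaluates the six formulas
# for one-dimensional f and g sampled on the same grid x covering D.
similarity_sketch <- function(x, y0.f, y1.f, y0.g, y1.g, method) {
  len_D <- max(x) - min(x)                       # |D|
  inner <- function(u, v) trapz(x, u * v)        # L2 inner product on D
  l2    <- function(u) sqrt(inner(u, u))         # L2 norm on D
  ctr   <- function(u) u - trapz(x, u) / len_D   # u minus its mean over D
  pearson <- function(u, v) inner(u, v) / (l2(u) * l2(v))
  switch(method,
    "d0.pearson"     = pearson(y0.f, y0.g),
    "d1.pearson"     = pearson(y1.f, y1.g),
    "d0.L2"          = l2(y0.f - y0.g) / len_D,
    "d1.L2"          = l2(y1.f - y1.g) / len_D,
    "d0.L2.centered" = l2(ctr(y0.f) - ctr(y0.g)) / len_D,
    "d1.L2.centered" = l2(ctr(y1.f) - ctr(y1.g)) / len_D,
    stop("unknown similarity.method: ", method))
}

x  <- seq(0, 1, length.out = 201)
f0 <- x^2;      f1 <- 2 * x   # f and f'
g0 <- x^2 + 3;  g1 <- 2 * x   # g = f + 3, so g' = f'
similarity_sketch(x, f0, f1, g0, g1, "d1.pearson")      # close to 1: identical derivatives
similarity_sketch(x, f0, f1, g0, g1, "d0.L2")           # close to 3: constant offset, |D| = 1
similarity_sketch(x, f0, f1, g0, g1, "d0.L2.centered")  # close to 0: centering removes the offset
```

The example shows why the choice of measure matters: a vertical offset between f and g is penalized by 'd0.L2' but ignored by the centered and derivative-based measures.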

For multidimensional functions, if similarity.method is 'd0.pearson' or 'd1.pearson', the similarity measure is computed as the average of the indexes obtained in each direction (component).
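
A minimal sketch of this averaging, assuming that "directions" means the d columns of the evaluation matrices and that both curves share the grid `x`. The helpers `cosine_1d` and `pearson_multidim` are hypothetical and are not exported by the package.

```r
# Cosine similarity of two sampled one-dimensional functions on grid x,
# with L2 inner products approximated by the trapezoidal rule.
cosine_1d <- function(x, u, v) {
  w <- diff(x)
  inner <- function(a, b) sum(w * ((a * b)[-1] + (a * b)[-length(a)]) / 2)
  inner(u, v) / sqrt(inner(u, u) * inner(v, v))
}

# Hypothetical helper: average the per-component cosines of two
# d-dimensional curves stored as (length x d) matrices.
pearson_multidim <- function(x, Y.f, Y.g) {
  stopifnot(ncol(Y.f) == ncol(Y.g))
  mean(vapply(seq_len(ncol(Y.f)),
              function(k) cosine_1d(x, Y.f[, k], Y.g[, k]),
              numeric(1)))
}

x   <- seq(0, 1, length.out = 101)
Y.f <- cbind(x, x + 1)            # two components of f
Y.g <- cbind(2 * x, -(x + 1))     # first parallel to f's, second opposite
pearson_multidim(x, Y.f, Y.g)     # components give about 1 and -1, average about 0
```

For 'd1.pearson' the same averaging would be applied to the matrices of derivative evaluations.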

The coherence properties specified in Sangalli et al. (2010) imply that if similarity.method is set to 'd0.L2', 'd1.L2', 'd0.L2.centered' or 'd1.L2.centered', the value of warping.method must be 'shift' or 'NOalignment'. If similarity.method is set to 'd0.pearson' or 'd1.pearson', all values of warping.method are allowed.
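
The compatibility rule can be written as a small check. This is a sketch only: `check_coherence` is a hypothetical helper, not a package function, and 'affine' is used as an assumed example of a warping.method value other than the two named on this page.

```r
# Hypothetical check, not exported by the package: encodes the rule that
# L2-type dissimilarities are only coherent with 'shift' or 'NOalignment'.
check_coherence <- function(similarity.method, warping.method) {
  l2_methods <- c("d0.L2", "d1.L2", "d0.L2.centered", "d1.L2.centered")
  if (similarity.method %in% l2_methods &&
      !(warping.method %in% c("shift", "NOalignment"))) {
    return(FALSE)
  }
  TRUE
}

check_coherence("d0.L2", "shift")        # TRUE
check_coherence("d1.pearson", "affine")  # TRUE: Pearson measures allow any warping
check_coherence("d0.L2", "affine")       # FALSE
```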

Value

scalar: the similarity/dissimilarity measure between the two functions f and g, computed with the chosen similarity.method.

Author(s)

Alice Parodi, Mirco Patriarca, Laura Sangalli, Piercesare Secchi, Simone Vantini, Valeria Vitelli.

References

Sangalli, L.M., Secchi, P., Vantini, S., Vitelli, V., 2010. "K-mean alignment for curve clustering". Computational Statistics and Data Analysis, 54, 1219-1233.

Sangalli, L.M., Secchi, P., Vantini, S., 2014. "Analysis of AneuRisk65 data: K-mean Alignment". Electronic Journal of Statistics, Special Section on "Statistics of Time Warpings and Phase Variations", Vol. 8, No. 2, 1891-1904.