kma.similarity {briKmeans}    R Documentation

Similarity/dissimilarity index between two functions

Description

kma.similarity computes a similarity/dissimilarity measure between two functions f and g. The type of measure is selected through similarity.method.

Usage

kma.similarity(x.f = NULL, y0.f = NULL, y1.f = NULL,
x.g = NULL, y0.g = NULL, y1.g = NULL, similarity.method, unif.grid = TRUE)

Arguments

x.f

vector of length length.f: abscissa grid on which the function f and its first derivative f' are evaluated. length.f is the number of abscissa values at which f is evaluated. x.f must always be provided.

y0.f

vector of length length.f or matrix of size length.f x d: evaluations of the function f on the abscissa grid x.f. length.f is the number of abscissa values at which f is evaluated; d (only if f and g are multidimensional) is the number of components of the function, i.e. f is a d-dimensional curve. The default value of y0.f is NULL. y0.f must be provided if the chosen similarity.method concerns the original functions.

y1.f

vector of length length.f or matrix of size length.f x d: evaluations of the first derivative of f, i.e. f', on the abscissa grid x.f. The default value of y1.f is NULL. y1.f must be provided if the chosen similarity.method concerns the first derivatives of the functions.

x.g

vector of length length.g: abscissa grid on which the function g and its first derivative g' are evaluated. length.g is the number of abscissa values at which g is evaluated. x.g must always be provided.

y0.g

vector of length length.g or matrix of size length.g x d: evaluations of the function g on the abscissa grid x.g. length.g is the number of abscissa values at which g is evaluated; d (only if f and g are multidimensional) is the number of components of the function, i.e. g is a d-dimensional curve. The default value of y0.g is NULL. y0.g must be provided if the chosen similarity.method concerns the original functions.

y1.g

vector of length length.g or matrix of size length.g x d: evaluations of the first derivative of g, i.e. g', on the abscissa grid x.g. The default value of y1.g is NULL. y1.g must be provided if the chosen similarity.method concerns the first derivatives of the functions.

similarity.method

character: similarity/dissimilarity measure between f and g. Possible choices are: 'd0.pearson', 'd1.pearson', 'd0.L2', 'd1.L2', 'd0.L2.centered', 'd1.L2.centered'. Default value is 'd1.pearson'. See Details.

unif.grid

boolean: if TRUE, the similarity measure is computed over a uniform grid built on the intersection of the domains of the two functions, i.e. an additional discretization is performed. If FALSE, no additional discretization is performed: the functions are assumed to be already defined on the same abscissa grid, and that grid is assumed to be fine enough to compute the similarity accurately.
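The additional discretization described for unif.grid can be pictured with the base-R sketch below. It is only an illustration under stated assumptions: the helper name resample_common, the grid size n and the use of linear interpolation via approx() are all hypothetical and are not taken from the package source.

```r
# Hypothetical helper (not part of briKmeans): resample two sampled functions
# onto one uniform grid covering the intersection of their domains.
resample_common <- function(x.f, y.f, x.g, y.g, n = 100) {
  lo <- max(min(x.f), min(x.g))   # left end of the common domain D
  hi <- min(max(x.f), max(x.g))   # right end of the common domain D
  if (lo >= hi) stop("the two domains do not overlap")
  grid <- seq(lo, hi, length.out = n)
  # Linear interpolation; rule = 2 guards against NA values caused by
  # rounding at the end points of the grid.
  list(x = grid,
       f = approx(x.f, y.f, xout = grid, rule = 2)$y,
       g = approx(x.g, y.g, xout = grid, rule = 2)$y)
}

x.f <- seq(0, 2, length.out = 41)   # f is sampled on [0, 2]
x.g <- seq(1, 3, length.out = 61)   # g is sampled on [1, 3]
res <- resample_common(x.f, sin(x.f), x.g, cos(x.g), n = 11)
range(res$x)                        # end points of the common domain [1, 2]
```

With unif.grid = FALSE no such step would take place, so both functions would have to be supplied on the same grid already.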

Details

The currently available similarities/dissimilarities are listed below. All norms and inner products are computed over D, the intersection of the domains of f and g. \overline{f} and \overline{g} denote the mean values of the functions f and g, respectively.

1. 'd0.pearson': this similarity measure is the cosine of the angle between the two functions f and g.

\frac{\langle f, g \rangle_{L^2}}{\|f\|_{L^2} \|g\|_{L^2}}

2. 'd1.pearson': this similarity measure is the cosine of the angle between the first derivatives f' and g' of the two functions.

\frac{\langle f', g' \rangle_{L^2}}{\|f'\|_{L^2} \|g'\|_{L^2}}

3. 'd0.L2': this dissimilarity measure is the L2 distance between the two functions f and g, normalized by the length of the common domain D.

\frac{\|f - g\|_{L^2}}{|D|}

4. 'd1.L2': this dissimilarity measure is the L2 distance between the first derivatives f' and g' of the two functions, normalized by the length of the common domain D.

\frac{\|f' - g'\|_{L^2}}{|D|}

5. 'd0.L2.centered': this dissimilarity measure is the L2 distance between f - \overline{f} and g - \overline{g}, normalized by the length of the common domain D.

\frac{\|(f - \overline{f}) - (g - \overline{g})\|_{L^2}}{|D|}

6. 'd1.L2.centered': this dissimilarity measure is the L2 distance between f' - \overline{f'} and g' - \overline{g'}, normalized by the length of the common domain D.

\frac{\|(f' - \overline{f'}) - (g' - \overline{g'})\|_{L^2}}{|D|}
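The formulas above can be approximated numerically for one-dimensional functions sampled on a shared grid. The base-R sketch below is a hedged illustration, not the package implementation: all helper names are hypothetical, integrals are approximated with the trapezoidal rule, and the mean value of a function is assumed to be its integral over D divided by |D|. The 'd1' variants would apply the same functions to the derivative values.

```r
# Trapezoidal approximation of the integral of y over the grid x.
trapz <- function(x, y) sum(diff(x) * (y[-length(y)] + y[-1]) / 2)

# L2 inner product and L2 norm on the common grid x.
inner_l2 <- function(x, u, v) trapz(x, u * v)
norm_l2  <- function(x, u) sqrt(inner_l2(x, u, u))

# 'd0.pearson'-style similarity: cosine of the angle between f and g.
sim_pearson <- function(x, f, g) {
  inner_l2(x, f, g) / (norm_l2(x, f) * norm_l2(x, g))
}

# 'd0.L2'-style dissimilarity: L2 distance divided by the length of D.
dis_l2 <- function(x, f, g) norm_l2(x, f - g) / diff(range(x))

# 'd0.L2.centered'-style dissimilarity: subtract each mean value first.
fmean <- function(x, u) trapz(x, u) / diff(range(x))
dis_l2_centered <- function(x, f, g) {
  dis_l2(x, f - fmean(x, f), g - fmean(x, g))
}

x <- seq(0, 1, length.out = 201)
f <- sin(2 * pi * x)
sim_pearson(x, f, 3 * f)       # rescaling f leaves the cosine at 1
dis_l2_centered(x, f, f + 5)   # centering removes the vertical shift
dis_l2(x, f, f + 5)            # the plain L2 distance does not
```

The three calls show the practical difference between the families: the Pearson index ignores amplitude, the centered L2 distance ignores vertical shifts, and the plain L2 distance is sensitive to both.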

For multidimensional functions, if similarity.method is 'd0.pearson' or 'd1.pearson', the similarity measure is computed as the average of the indices in all directions, i.e. over the d components.
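The averaging over components for the Pearson indices can be sketched as follows in base R. This is an illustration only: the helper names are hypothetical, the functions are assumed to be sampled on one shared grid, and the integrals are approximated with the trapezoidal rule.

```r
# Trapezoidal approximation of the integral of y over the grid x.
trapz <- function(x, y) sum(diff(x) * (y[-length(y)] + y[-1]) / 2)

# Cosine of the angle between two one-dimensional functions u and v.
cosine_l2 <- function(x, u, v) {
  trapz(x, u * v) / sqrt(trapz(x, u^2) * trapz(x, v^2))
}

# f and g are length(x) x d matrices, one column per component.
pearson_multidim <- function(x, f, g) {
  stopifnot(is.matrix(f), is.matrix(g), identical(dim(f), dim(g)))
  per_component <- vapply(seq_len(ncol(f)),
                          function(k) cosine_l2(x, f[, k], g[, k]),
                          numeric(1))
  mean(per_component)   # average of the d component-wise indices
}

x <- seq(0, 1, length.out = 101)
f <- cbind(sin(2 * pi * x), cos(2 * pi * x))
g <- cbind(2 * sin(2 * pi * x), -cos(2 * pi * x))
pearson_multidim(x, f, g)   # components give +1 and -1, so the average is near 0
```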

The coherence properties specified in Sangalli et al. (2010) imply that if similarity.method is set to 'd0.L2', 'd1.L2', 'd0.L2.centered' or 'd1.L2.centered', the value of warping.method must be 'shift' or 'NOalignment'. If similarity.method is set to 'd0.pearson' or 'd1.pearson', all values of warping.method are allowed.
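The compatibility rule above can be written as a small check. The helper is_coherent below is hypothetical and not part of briKmeans. The value 'affine' used in the calls is an assumption that merely stands in for any warping.method other than 'shift' and 'NOalignment'.

```r
# Hypothetical helper (not part of briKmeans): TRUE when the pair
# (similarity.method, warping.method) satisfies the rule stated above.
is_coherent <- function(similarity.method, warping.method) {
  pearson <- c("d0.pearson", "d1.pearson")
  l2 <- c("d0.L2", "d1.L2", "d0.L2.centered", "d1.L2.centered")
  if (similarity.method %in% pearson) {
    return(TRUE)   # all warping methods are allowed
  }
  if (similarity.method %in% l2) {
    return(warping.method %in% c("shift", "NOalignment"))
  }
  stop("unknown similarity.method: ", similarity.method)
}

is_coherent("d1.L2", "shift")         # TRUE
is_coherent("d1.L2", "affine")        # FALSE ('affine' is an assumed value)
is_coherent("d1.pearson", "affine")   # TRUE
```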

Value

scalar: similarity/dissimilarity measure between the two functions f and g, computed with the measure specified in similarity.method.
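A call might look like the sketch below, which follows the signature in Usage. The data are made up for illustration, and the call is guarded because it assumes that briKmeans is installed and exports kma.similarity. Since g' is a positive multiple of f' here, the 'd1.pearson' formula suggests a value close to 1.

```r
# Made-up data: derivatives of f(x) = sin(x) and g(x) = 2 sin(x),
# sampled on two different grids over the same interval.
x.f  <- seq(0, 2 * pi, length.out = 100)
y1.f <- cos(x.f)
x.g  <- seq(0, 2 * pi, length.out = 150)
y1.g <- 2 * cos(x.g)

# Guarded call: assumes briKmeans is installed and exports kma.similarity.
if (requireNamespace("briKmeans", quietly = TRUE)) {
  s <- briKmeans::kma.similarity(x.f = x.f, y1.f = y1.f,
                                 x.g = x.g, y1.g = y1.g,
                                 similarity.method = "d1.pearson",
                                 unif.grid = TRUE)
  print(s)
}
```

Only the derivative values are passed because 'd1.pearson' concerns first derivatives. A 'd0' method would need y0.f and y0.g instead.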

Author(s)

Alice Parodi, Mirco Patriarca, Laura Sangalli, Piercesare Secchi, Simone Vantini, Valeria Vitelli.

References

Sangalli, L.M., Secchi, P., Vantini, S., Vitelli, V., 2010. "K-mean alignment for curve clustering". Computational Statistics and Data Analysis, 54, 1219-1233.

Sangalli, L.M., Secchi, P., Vantini, S., 2014. "Analysis of AneuRisk65 data: K-mean Alignment". Electronic Journal of Statistics, Special Section on "Statistics of Time Warpings and Phase Variations", Vol. 8, No. 2, 1891-1904.

See Also

kma


[Package briKmeans version 1.0 Index]