R: Distance between discrete probability distributions given the...

ddhellingerpar {dad}

R Documentation

Distance between discrete probability distributions given the probabilities on their common support

Description

Computes the Hellinger (or Matusita) distance between two discrete probability distributions on the same support (which can be a Cartesian product of q sets), given the probabilities of the states (which are q-tuples) of the support.

Usage

ddhellingerpar(p1, p2)

Arguments

`p1`	array (or table) with `q` dimensions. The probabilities of the first distribution on the support.
`p2`	array (or table) with `q` dimensions, the same as those of `p1`. The probabilities of the second distribution on the support.
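
As a rough illustration of the shape these arguments are expected to have, the base R sketch below checks that two arrays share the same dimensions and that each holds non-negative values summing to 1. The helper name `check_prob_arrays` and its `tol` parameter are hypothetical and not part of dad; whether `ddhellingerpar` performs such checks itself is not stated on this page.

```r
# Hypothetical helper (not part of dad): basic sanity checks on two
# probability arrays defined on the same support.
check_prob_arrays <- function(p1, p2, tol = 1e-8) {
  # Both arrays must describe the same support, hence the same dimensions
  same_dim <- identical(dim(p1), dim(p2))
  # A probability array has non-negative entries that sum to 1
  valid <- function(p) all(p >= 0) && abs(sum(p) - 1) < tol
  same_dim && valid(p1) && valid(p2)
}

p1 <- array(c(1/2, 1/2), dimnames = list(c("a", "b")))
p2 <- array(c(1/4, 3/4), dimnames = list(c("a", "b")))
check_prob_arrays(p1, p2)
```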

Details

The Hellinger distance between two discrete distributions p_1 and p_2 is given by: \sqrt{\sum_x \left(\sqrt{p_1(x)} - \sqrt{p_2(x)}\right)^2}, where the sum runs over all states x of the common support.

Note that some authors divide this expression by \sqrt{2}, so that the distance lies between 0 and 1 rather than between 0 and \sqrt{2}.
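
The formula above can be evaluated directly in base R, which may help when checking results by hand. The following is a minimal sketch, not the package's implementation: the function name `hellinger_sketch` and its `normalize` argument are assumptions introduced for illustration only.

```r
# Hypothetical helper (not part of dad): Hellinger distance computed
# directly from the formula given in the Details section.
hellinger_sketch <- function(p1, p2, normalize = FALSE) {
  # Both arrays must be defined on the same support
  stopifnot(identical(dim(p1), dim(p2)))
  d <- sqrt(sum((sqrt(p1) - sqrt(p2))^2))
  # Optional division by sqrt(2), the convention used by some authors
  if (normalize) d / sqrt(2) else d
}

p1 <- array(c(1/2, 1/2), dimnames = list(c("a", "b")))
p2 <- array(c(1/4, 3/4), dimnames = list(c("a", "b")))
hellinger_sketch(p1, p2)                    # about 0.261
hellinger_sketch(p1, p2, normalize = TRUE)  # about 0.185
```

With `normalize = FALSE` the sketch follows the unnormalized definition stated above, so two distributions with disjoint supports are at distance \sqrt{2}.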

Author(s)

Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Sabine Demotes-Mainard

References

Deza, M.M. and Deza, E. (2013). Encyclopedia of Distances. Springer.

Examples

# Example 1: two distributions on the one-dimensional support {a, b}
p1 <- array(c(1/2, 1/2), dimnames = list(c("a", "b")))
p2 <- array(c(1/4, 3/4), dimnames = list(c("a", "b")))
ddhellingerpar(p1, p2)

# Example 2: joint distributions of (x, y), estimated from two data frames
x1 <- data.frame(x = factor(c("A", "A", "A", "B", "B", "B")),
                 y = factor(c("a", "a", "a", "b", "b", "b")))
x2 <- data.frame(x = factor(c("A", "A", "A", "B", "B")),
                 y = factor(c("a", "a", "b", "a", "b")))
# Contingency tables divided by the sample sizes give the joint probabilities
p1 <- table(x1)/nrow(x1)
p2 <- table(x2)/nrow(x2)
ddhellingerpar(p1, p2)

[Package dad version 4.1.2 Index]