jaccard {abdiv} | R Documentation |
These functions transform the input vectors to binary or presence/absence format, then compute a distance or dissimilarity.
jaccard(x, y)
sorenson(x, y)
kulczynski_first(x, y)
kulczynski_second(x, y)
rogers_tanimoto(x, y)
russel_rao(x, y)
sokal_michener(x, y)
sokal_sneath(x, y)
yule_dissimilarity(x, y)
x, y    Numeric vectors
Many of these indices are covered in Koleff et al. (2003), so we adopt their notation. For two vectors x and y, we define four quantities:
a is the number of species that are present in both x and y,
b is the number of species that are present in y but not x,
c is the number of species that are present in x but not y, and
d is the number of species absent in both vectors.
The quantity d is seldom used in ecology, for good reason: species absent from both samples say little about how similar the samples are. For details, please see the discussion of the "double zero problem" in section 2 of chapter 7.2 in Legendre & Legendre.
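As a minimal sketch of the notation above, the four quantities can be tallied in base R. The helper name presence_counts() is hypothetical and not part of abdiv, and treating any nonzero value as a presence is an assumption made for illustration.

```r
# Hypothetical helper (not part of abdiv): tally the four presence/absence
# quantities for two numeric vectors of equal length.
presence_counts <- function(x, y) {
  stopifnot(length(x) == length(y))
  px <- x != 0  # assumption: any nonzero value counts as a presence
  py <- y != 0
  c(
    a = sum(px & py),    # present in both x and y
    b = sum(!px & py),   # present in y but not x
    c = sum(px & !py),   # present in x but not y
    d = sum(!px & !py)   # absent in both
  )
}

x <- c(15, 6, 4, 0, 3, 0)
y <- c(10, 2, 0, 1, 1, 0)
counts <- presence_counts(x, y)
print(counts)  # a = 3, b = 1, c = 1, d = 1
```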
The Jaccard index of dissimilarity is 1 - a / (a + b + c), or one minus the proportion of shared species, counting over both samples together. Relation of jaccard() to other definitions:
Equivalent to R's built-in dist() function with method = "binary".
Equivalent to vegdist() with method = "jaccard" and binary = TRUE.
Equivalent to the jaccard() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to 1 - S_7 in Legendre & Legendre.
Equivalent to 1 - β_j, as well as β_{cc} and β_g, in Koleff et al. (2003).
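The equivalence with dist() can be checked by hand. The sketch below uses only base R; the example vectors and the variable names are made up for illustration, and presence is assumed to mean a nonzero value.

```r
x <- c(15, 6, 4, 0, 3, 0)
y <- c(10, 2, 0, 1, 1, 0)

n_shared <- sum(x != 0 & y != 0)  # a: present in both
n_only_y <- sum(x == 0 & y != 0)  # b: present in y but not x
n_only_x <- sum(x != 0 & y == 0)  # c: present in x but not y

# Jaccard dissimilarity from the formula 1 - a / (a + b + c)
jaccard_manual <- 1 - n_shared / (n_shared + n_only_y + n_only_x)

# The same quantity from base R's dist(), which compares rows of a matrix
jaccard_dist <- as.numeric(dist(rbind(x, y), method = "binary"))

print(c(manual = jaccard_manual, dist = jaccard_dist))
```

Here a = 3, b = 1, and c = 1, so both values come out to 0.4.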
The Sørenson or Dice index of dissimilarity is 1 - 2a / (2a + b + c), or one minus the average proportion of shared species, counting over each sample individually. Relation of sorenson() to other definitions:
Equivalent to the dice() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to the sorclass calculator in Mothur, and to 1 - whittaker.
Equivalent to D_13 = 1 - S_8 in Legendre & Legendre.
Equivalent to 1 - β_{sor} in Koleff et al. (2003). Also equivalent to Whittaker's beta diversity (the second definition, β_w = (S / \bar{a}) - 1), as well as β_{-1}, β_t, β_{me}, and β_{hk}.
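Because a + b is the number of species in y and a + c is the number in x, the denominator 2a + b + c is simply the sum of the two species counts. A small base R sketch of that reading follows, with illustrative vectors and variable names that are not part of abdiv.

```r
x <- c(15, 6, 4, 0, 3, 0)
y <- c(10, 2, 0, 1, 1, 0)

n_shared   <- sum(x != 0 & y != 0)  # a
richness_x <- sum(x != 0)           # a + c: species present in x
richness_y <- sum(y != 0)           # a + b: species present in y

# 1 - 2a / (2a + b + c), written with the two richness values
sorenson_manual <- 1 - 2 * n_shared / (richness_x + richness_y)
print(sorenson_manual)  # 0.25
```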
We have not been able to track down the original reference for the first and second Kulczynski indices, but we have good formulas from Legendre & Legendre. The first Kulczynski index is 1 - a / (b + c), or one minus the ratio of shared to unshared species.
Relation of kulczynski_first() to other definitions:
Equivalent to 1 - S_12 in Legendre & Legendre.
Equivalent to the kulczynski calculator in Mothur.
Some people refer to the second Kulczynski index as the Kulczynski-Cody index. It is defined as one minus the average proportion of shared species in each vector,
d = 1 - \frac{1}{2} \left( \frac{a}{a + b} + \frac{a}{a + c} \right),
where d on the left-hand side denotes the dissimilarity, not the count of double absences.
Relation of kulczynski_second() to other definitions:
Equivalent to 1 - S_13 in Legendre & Legendre.
Equivalent to the kulczynskicody calculator in Mothur.
Equivalent to one minus the Kulczynski similarity in Hayek (1994).
Equivalent to vegdist() with method = "kulczynski" and binary = TRUE.
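To see how averaging the two proportions differs from pooling the samples, the sketch below evaluates the second Kulczynski formula next to the Sørenson formula on two vectors with unequal species counts. It uses base R only; the vectors and variable names are illustrative assumptions.

```r
x <- c(15, 6, 4, 0, 3, 0)  # four species present
y <- c(10, 2, 0, 0, 0, 0)  # two species present

n_shared <- sum(x != 0 & y != 0)  # a = 2
n_only_y <- sum(x == 0 & y != 0)  # b = 0
n_only_x <- sum(x != 0 & y == 0)  # c = 2

# Second Kulczynski: one minus the mean of the two shared proportions
kulczynski_second_manual <- 1 - 0.5 * (
  n_shared / (n_shared + n_only_y) +  # shared fraction of y's species
  n_shared / (n_shared + n_only_x)    # shared fraction of x's species
)

# Sorenson on the same data, for comparison
sorenson_manual <- 1 - 2 * n_shared / (2 * n_shared + n_only_y + n_only_x)

print(c(kulczynski_second = kulczynski_second_manual,
        sorenson = sorenson_manual))
```

With these inputs the second Kulczynski value is 0.25 while the Sørenson value is 1/3; the two agree only when b equals c.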
The Rogers-Tanimoto distance is defined as (2b + 2c) / (a + 2b + 2c + d). Relation of rogers_tanimoto() to other definitions:
Equivalent to the rogerstanimoto() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to 1 - S_2 in Legendre & Legendre.
The Russel-Rao distance is defined as (b + c + d) / (a + b + c + d), or the fraction of elements not present in both vectors, counting double absences. Relation of russel_rao() to other definitions:
Equivalent to the russellrao() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to 1 - S_11 in Legendre & Legendre.
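Because d appears in the Russel-Rao formula, padding both vectors with species that are absent everywhere changes the result, whereas the Jaccard formula is unaffected. The sketch below illustrates this in base R; the two helper functions are hypothetical, written directly from the formulas on this page, and are not the abdiv implementations.

```r
# Hypothetical helpers written from the formulas in this page
russel_rao_manual <- function(x, y) {
  n_a <- sum(x != 0 & y != 0)
  n_b <- sum(x == 0 & y != 0)
  n_c <- sum(x != 0 & y == 0)
  n_d <- sum(x == 0 & y == 0)
  (n_b + n_c + n_d) / (n_a + n_b + n_c + n_d)
}

jaccard_manual <- function(x, y) {
  n_a <- sum(x != 0 & y != 0)
  n_b <- sum(x == 0 & y != 0)
  n_c <- sum(x != 0 & y == 0)
  1 - n_a / (n_a + n_b + n_c)
}

x <- c(15, 6, 4, 0, 3, 0)
y <- c(10, 2, 0, 1, 1, 0)

# Append four species that are absent from both samples
x_padded <- c(x, rep(0, 4))
y_padded <- c(y, rep(0, 4))

rr_before <- russel_rao_manual(x, y)                # 3 / 6
rr_after  <- russel_rao_manual(x_padded, y_padded)  # 7 / 10
jac_before <- jaccard_manual(x, y)
jac_after  <- jaccard_manual(x_padded, y_padded)

print(c(rr_before = rr_before, rr_after = rr_after,
        jac_before = jac_before, jac_after = jac_after))
```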
The Sokal-Michener distance is defined as (2b + 2c) / (a + 2b + 2c + d), the same formula as the Rogers-Tanimoto distance. Relation of sokal_michener() to other definitions:
Equivalent to the sokalmichener() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
The Sokal-Sneath distance is defined as (2b + 2c) / (a + 2b + 2c). Relation of sokal_sneath() to other definitions:
Equivalent to the sokalsneath() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to the anderberg calculator in Mothur.
Equivalent to 1 - S_10 in Legendre & Legendre.
The Yule dissimilarity is defined as 2bc / (ad + bc). Relation of yule_dissimilarity() to other definitions:
Equivalent to the yule() function in scipy.spatial.distance, except that we always convert vectors to presence/absence.
Equivalent to 1 - S, where S is the Yule coefficient in Legendre & Legendre.
The dissimilarity between x and y, based on presence/absence. The Jaccard, Sorenson, Sokal-Sneath, Yule, and both Kulczynski dissimilarities are not defined if both x and y have no nonzero elements. In addition, the second Kulczynski index and the Yule index of dissimilarity are not defined if one of the vectors has no nonzero elements. We return NaN for undefined values.
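The undefined cases correspond to a 0 / 0 in the formulas, which R evaluates to NaN. The sketch below reproduces two of them in base R; the helper functions are hypothetical re-implementations of the formulas on this page, not the abdiv code.

```r
# Hypothetical helpers written from the formulas in this page
jaccard_manual <- function(x, y) {
  n_a <- sum(x != 0 & y != 0)
  n_b <- sum(x == 0 & y != 0)
  n_c <- sum(x != 0 & y == 0)
  1 - n_a / (n_a + n_b + n_c)
}

kulczynski_second_manual <- function(x, y) {
  n_a <- sum(x != 0 & y != 0)
  n_b <- sum(x == 0 & y != 0)
  n_c <- sum(x != 0 & y == 0)
  1 - 0.5 * (n_a / (n_a + n_b) + n_a / (n_a + n_c))
}

empty    <- c(0, 0, 0)
occupied <- c(1, 0, 2)

# Both vectors empty: a = b = c = 0, so Jaccard is 1 - 0 / 0
both_empty_jaccard <- jaccard_manual(empty, empty)

# One vector empty: Jaccard is still defined, second Kulczynski is not
one_empty_jaccard    <- jaccard_manual(empty, occupied)
one_empty_kulczynski <- kulczynski_second_manual(empty, occupied)

print(c(both_empty_jaccard, one_empty_jaccard, one_empty_kulczynski))
```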