jaccard {mlr3measures} | R Documentation |
Jaccard Similarity Index
Description
Measures the similarity of two or more sets.
Usage
jaccard(sets, na_value = NaN, ...)
Arguments
sets |
( |
na_value |
( |
... |
( |
Details
For two sets A and B, the Jaccard Index is defined as
J(A, B) = \frac{|A \cap B|}{|A \cup B|}.
If more than two sets are provided, the mean of all pairwise scores is calculated.
This measure is undefined if two or more sets are empty.
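The definition above can be sketched in base R. This is only an illustration: the helper name jaccard_sketch is hypothetical, and it is not the mlr3measures implementation. It assumes that na_value is returned as soon as any pair of sets has an empty union, which is the case exactly when two or more sets are empty.

```r
# Hypothetical helper for illustration; not the mlr3measures implementation.
jaccard_sketch = function(sets, na_value = NaN) {
  stopifnot(is.list(sets), length(sets) >= 2L)

  # All unordered pairs of set indices, one pair per column.
  pairs = utils::combn(length(sets), 2L)

  scores = apply(pairs, 2L, function(idx) {
    a = unique(sets[[idx[1L]]])
    b = unique(sets[[idx[2L]]])
    n_union = length(union(a, b))
    # Both sets empty gives 0/0, so the pairwise score is undefined.
    if (n_union == 0L) {
      return(NA_real_)
    }
    length(intersect(a, b)) / n_union
  })

  # Assumption: one undefined pair makes the whole measure undefined.
  if (anyNA(scores)) na_value else mean(scores)
}

# Two sets: |{b}| / |{a, b, c}| = 1/3
jaccard_sketch(list(c("a", "b"), c("b", "c")))

# Three sets: mean of the pairwise scores 1/3, 2/3 and 2/3, i.e. 5/9
jaccard_sketch(list(c("a", "b"), c("b", "c"), c("a", "b", "c")))
```

A single empty set does not make the score undefined in this sketch: its pairwise scores against non-empty sets are simply 0.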
Value
Performance value as numeric(1).
Meta Information
Type: "similarity"
Range: [0, 1]
Minimize: FALSE
References
Jaccard, Paul (1901). “Étude comparative de la distribution florale dans une portion des Alpes et du Jura.” Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 547-579. doi:10.5169/SEALS-266450.
Bommert A, Rahnenführer J, Lang M (2017). “A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data.” Computational and Mathematical Methods in Medicine, 2017, 1–18. doi:10.1155/2017/7907163.
Bommert A, Lang M (2021). “stabm: Stability Measures for Feature Selection.” Journal of Open Source Software, 6(59), 3010. doi:10.21105/joss.03010.
See Also
Package stabm, which implements many more stability measures, including ones with a correction for chance.
Other Similarity Measures:
phi()
Examples
library(mlr3measures)

set.seed(1)
# Two random subsets of {"a", "b", "c"}: one of size 1, one of size 2
sets = list(
  sample(letters[1:3], 1),
  sample(letters[1:3], 2)
)
jaccard(sets)