jaccard {mlr3measures} | R Documentation
Jaccard Similarity Index
Description
Measure to compare two or more sets w.r.t. their similarity.
Usage
jaccard(sets, na_value = NaN, ...)
Arguments
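sets | (list()) List of sets to compare, for example character vectors.
na_value | (numeric(1)) Value returned if the measure is undefined for the input. Default is NaN.
... | Additional arguments.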
Details
For two sets $A$ and $B$, the Jaccard Index is defined as
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}.$$
If more than two sets are provided, the mean of all pairwise scores is calculated.
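As a hand-worked illustration of the pairwise mean, take three sets made up for this example, $A = \{a, b\}$, $B = \{b, c\}$ and $C = \{a, b, c\}$, and the standard definition $J(X, Y) = |X \cap Y| / |X \cup Y|$:

$$
\begin{aligned}
J(A, B) &= \frac{|\{b\}|}{|\{a, b, c\}|} = \frac{1}{3}, \\
J(A, C) &= \frac{|\{a, b\}|}{|\{a, b, c\}|} = \frac{2}{3}, \\
J(B, C) &= \frac{|\{b, c\}|}{|\{a, b, c\}|} = \frac{2}{3}, \\
\text{mean} &= \frac{1}{3}\left(\frac{1}{3} + \frac{2}{3} + \frac{2}{3}\right) = \frac{5}{9} \approx 0.556.
\end{aligned}
$$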
This measure is undefined if two or more sets are empty.
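The definition above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper names jaccard_pair() and jaccard_mean() are hypothetical, and returning NaN when both sets of a pair are empty is an assumption chosen to mirror the default na_value.

```r
# Hypothetical helpers for illustration only; not part of mlr3measures.

# Jaccard index of two sets: size of the intersection over size of the union.
jaccard_pair = function(a, b) {
  n_union = length(union(a, b))
  if (n_union == 0L) {
    return(NaN)  # both sets empty: 0 / 0 is undefined (assumed behaviour)
  }
  length(intersect(a, b)) / n_union
}

# Mean of the Jaccard index over all unordered pairs of sets.
jaccard_mean = function(sets) {
  stopifnot(is.list(sets), length(sets) >= 2L)
  pairs = combn(length(sets), 2L)  # 2 x choose(n, 2) matrix of index pairs
  scores = apply(pairs, 2L, function(idx) {
    jaccard_pair(sets[[idx[1L]]], sets[[idx[2L]]])
  })
  mean(scores)  # NaN propagates if any pair is undefined
}

sets = list(c("a", "b"), c("b", "c"), c("a", "b", "c"))
jaccard_pair(sets[[1]], sets[[2]])  # 1/3
jaccard_mean(sets)                  # 5/9
```

Because mean() propagates NaN, a single pair of empty sets makes the whole score undefined in this sketch, which matches the statement that the measure is undefined if two or more sets are empty.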
Value
Performance value as numeric(1).
Meta Information
Type: "similarity"
Range: [0, 1]
Minimize: FALSE
References
Jaccard, Paul (1901). “Étude comparative de la distribution florale dans une portion des Alpes et du Jura.” Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 547-579. doi:10.5169/SEALS-266450.
Bommert A, Rahnenführer J, Lang M (2017). “A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data.” Computational and Mathematical Methods in Medicine, 2017, 1–18. doi:10.1155/2017/7907163.
Bommert A, Lang M (2021). “stabm: Stability Measures for Feature Selection.” Journal of Open Source Software, 6(59), 3010. doi:10.21105/joss.03010.
See Also
The stabm package implements many more stability measures, with correction for chance included.
Other Similarity Measures:
phi()
Examples
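set.seed(1)
# Two random subsets of {"a", "b", "c"}: one of size 1 and one of size 2
sets = list(
  sample(letters[1:3], 1),
  sample(letters[1:3], 2)
)
# Jaccard index of the two sets
jaccard(sets)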