fowlkes_mallows_pairs {clevr}    R Documentation

Fowlkes-Mallows Index of Linked Pairs

Description

Computes the Fowlkes-Mallows index for a set of predicted coreferent (linked) pairs given a set of ground truth coreferent pairs.

Usage

fowlkes_mallows_pairs(true_pairs, pred_pairs, ordered = FALSE)

Arguments

true_pairs

set of true coreferent pairs stored in a matrix or data.frame, where rows index pairs and columns index the ids of the constituents. Any pairs not included are assumed to be non-coreferent. Duplicate pairs (including equivalent pairs with reversed ids) are automatically removed.

pred_pairs

set of predicted coreferent pairs, following the same specification as true_pairs.

ordered

whether to treat the element pairs as ordered, i.e. whether pair (x, y) is distinct from pair (y, x) for x ≠ y. Defaults to FALSE, which is appropriate for clustering, undirected link prediction, record linkage, etc.

Details

The Fowlkes-Mallows index is defined as the geometric mean of precision P and recall R:

\sqrt{P R}.
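
To make the definition concrete, it can be written in terms of the pair sets themselves. The following is a sketch assuming the usual pairwise definitions of precision and recall; the symbols T (the set of true pairs) and Q (the set of predicted pairs) are notation introduced here for illustration and do not come from the package.

```latex
P = \frac{|T \cap Q|}{|Q|}, \qquad
R = \frac{|T \cap Q|}{|T|}, \qquad
\sqrt{P R} = \frac{|T \cap Q|}{\sqrt{|T|\,|Q|}}
```

For instance, with three true pairs and two predicted pairs that are both correct, P = 2/2 = 1 and R = 2/3, so the index is \sqrt{2/3} ≈ 0.816.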

References

Fowlkes, E. B. and Mallows, C. L. "A Method for Comparing Two Hierarchical Clusterings." Journal of the American Statistical Association 78:383, 553-569, (1983). doi:10.1080/01621459.1983.10478008.

Examples

true_pairs <- rbind(c(1,2), c(2,3), c(1,3)) # ground truth is 3-clique
pred_pairs <- rbind(c(1,2), c(2,3))         # prediction misses one edge
fowlkes_mallows_pairs(true_pairs, pred_pairs)
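
As a cross-check, the same quantity can be computed by hand in base R. This is a minimal sketch, assuming unordered pairs and the usual pairwise definitions of precision and recall. The helper pair_keys is hypothetical and not part of clevr, and the package's internal implementation may well differ.

```r
# Hypothetical helper (not part of clevr): encode each unordered pair as a
# string key with the smaller id first, so (y, x) matches (x, y), and drop
# duplicate pairs.
pair_keys <- function(pairs) {
  lo <- pmin(pairs[, 1], pairs[, 2])
  hi <- pmax(pairs[, 1], pairs[, 2])
  unique(paste(lo, hi, sep = "-"))
}

true_pairs <- rbind(c(1, 2), c(2, 3), c(1, 3)) # ground truth is 3-clique
pred_pairs <- rbind(c(1, 2), c(2, 3))          # prediction misses one edge

true_keys <- pair_keys(true_pairs)
pred_keys <- pair_keys(pred_pairs)

# Pairs that appear in both the prediction and the ground truth
n_correct <- length(intersect(true_keys, pred_keys))

precision <- n_correct / length(pred_keys) # 2 / 2
recall    <- n_correct / length(true_keys) # 2 / 3

fm <- sqrt(precision * recall)             # geometric mean of P and R
fm
```

Here both predicted pairs are correct but one true pair is missed, so the hand computation gives sqrt(2/3), roughly 0.816.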


[Package clevr version 0.1.2 Index]