adj_RI {CommKern} | R Documentation |
Computes the adjusted Rand Index (ARI), a chance-corrected measure of agreement between two vectors of classifications.
adj_RI(a, b)
a |
a vector of classifications; this must be a vector of characters, integers, numerics, or a factor, but not a list. |
b |
a vector of classifications |
In statistics, and in data clustering in particular, the Rand Index (also called the Rand Measure) is a measure of the similarity between two data clusterings or classifications. If N is the set of elements and X and Y are two partitions of N into subsets, then the Rand Index is built from four pair counts: (a) the number of pairs of elements in N that are in the same subset in X and the same subset in Y; (b) the number of pairs of elements in N that are in different subsets in X and different subsets in Y; (c) the number of pairs of elements in N that are in the same subset in X but different subsets in Y; and (d) the number of pairs of elements in N that are in different subsets in X but the same subset in Y. The Rand Index is (a + b) / (a + b + c + d), the fraction of pairs on which the two partitions agree. The adjusted Rand Index is the corrected-for-chance version of the Rand Index, which establishes a baseline by using the expected similarity of all pairwise comparisons between clusterings specified by a random model. The ARI equals 1 for identical partitions, is close to 0 when agreement is no better than chance, and can be negative when the index is less than its expected value.
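The pair counts described above can be sketched in base R. This is an illustration only and is not taken from the CommKern source: the helper name `rand_sketch` and its return fields are hypothetical, and it assumes the usual contingency-table form of the ARI with a permutation baseline, which may differ in detail from what `adj_RI` does internally.

```r
# Hypothetical helper (not part of CommKern): pair counts, Rand Index, and ARI
# for two label vectors of equal length.
rand_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab     <- table(x, y)                    # contingency table of the two labelings
  n_pairs <- choose(length(x), 2)           # total number of element pairs
  same_x  <- sum(choose(rowSums(tab), 2))   # pairs sharing a subset in X
  same_y  <- sum(choose(colSums(tab), 2))   # pairs sharing a subset in Y

  same_both   <- sum(choose(tab, 2))        # count (a): same in X and same in Y
  same_x_only <- same_x - same_both         # count (c): same in X, different in Y
  same_y_only <- same_y - same_both         # count (d): different in X, same in Y
  diff_both   <- n_pairs - same_both - same_x_only - same_y_only  # count (b)

  ri       <- (same_both + diff_both) / n_pairs
  expected <- same_x * same_y / n_pairs     # expected value of (a) under chance
  # Note: the denominator is zero when both partitions put every element
  # in a single subset; this sketch does not handle that case.
  ari      <- (same_both - expected) / ((same_x + same_y) / 2 - expected)

  list(same_both = same_both, diff_both = diff_both,
       same_x_only = same_x_only, same_y_only = same_y_only,
       ri = ri, ari = ari)
}

# Two partitions that cut across each other: no pair is grouped together in both.
res <- rand_sketch(c(1, 1, 2, 2), c(1, 2, 1, 2))
res$ri   # 1/3
res$ari  # -0.5, i.e. less agreement than expected by chance
```

The descriptive names are used in place of (a) to (d) to avoid a clash with the arguments `a` and `b` of `adj_RI`.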
a scalar with the adjusted Rand Index (ARI)
set.seed(7)
x <- sample(x = rep(1:3, 4), 12)
set.seed(18)
y <- sample(x = rep(1:3, 4), 12)
adj_RI(x, y)