adj_RI {CommKern}    R Documentation

Adjusted Rand Index (ARI)

Description

Computes the adjusted Rand Index (ARI), a chance-corrected measure of agreement between two vectors of classifications.

Usage

adj_RI(a, b)

Arguments

a

a vector of classifications; this must be a vector of characters, integers, numerics, or a factor, but not a list.

b

a vector of classifications for the same elements as a, in the same order

Details

In cluster analysis, the Rand Index (also called the Rand Measure) is a measure of the similarity between two data clusterings or classifications. If N is a set of elements and X and Y are two partitions of N into subsets (not necessarily the same number of subsets), then the Rand Index is built from four counts: (a) the number of pairs of elements in N that are in the same subset in X and the same subset in Y; (b) the number of pairs of elements in N that are in different subsets in X and different subsets in Y; (c) the number of pairs of elements in N that are in the same subset in X but different subsets in Y; and (d) the number of pairs of elements in N that are in different subsets in X but the same subset in Y. The Rand Index is the proportion of pairs on which the two partitions agree, (a + b) / (a + b + c + d); the labels (a) through (d) here denote pair counts, not the function arguments a and b. The adjusted Rand Index is the corrected-for-chance version of the Rand Index: it establishes a baseline from the expected similarity of all pairwise comparisons between clusterings under a random model. The ARI equals 1 when the two partitions are identical and is negative when the index is less than its expected value.
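The pair counts (a) through (d) and the chance correction can be illustrated with a minimal base R sketch. The function names pair_counts, rand_index_sketch and ari_sketch are hypothetical and are not part of CommKern. The sketch assumes the random model is the usual permutation model with fixed cluster sizes (the Hubert and Arabie form of the ARI); it is not taken from the package source and may differ from how adj_RI is implemented.

```r
# Hypothetical helpers for illustration only; not part of CommKern.

# Count the four kinds of element pairs described in Details.
pair_counts <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  idx <- utils::combn(length(x), 2)          # every unordered pair of elements
  same_x <- x[idx[1, ]] == x[idx[2, ]]       # pair shares a subset in X
  same_y <- y[idx[1, ]] == y[idx[2, ]]       # pair shares a subset in Y
  c(a = sum(same_x & same_y),                # same in X, same in Y
    b = sum(!same_x & !same_y),              # different in X, different in Y
    c = sum(same_x & !same_y),               # same in X, different in Y
    d = sum(!same_x & same_y))               # different in X, same in Y
}

# Unadjusted Rand Index: proportion of pairs on which X and Y agree.
rand_index_sketch <- function(x, y) {
  p <- pair_counts(x, y)
  unname((p["a"] + p["b"]) / sum(p))
}

# Adjusted Rand Index, assuming the permutation model (an assumption here).
ari_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  tab <- table(x, y)                                   # contingency table
  n_pairs   <- choose(length(x), 2)
  sum_cells <- sum(choose(as.vector(tab), 2))          # equals count (a)
  sum_rows  <- sum(choose(rowSums(tab), 2))            # pairs together in X
  sum_cols  <- sum(choose(colSums(tab), 2))            # pairs together in Y
  expected  <- sum_rows * sum_cols / n_pairs           # chance baseline
  maximum   <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (maximum - expected)
}

# Two partitions that cut across each other: agreement is below chance.
x <- c(1, 1, 2, 2)
y <- c(1, 2, 1, 2)
pair_counts(x, y)
rand_index_sketch(x, y)
ari_sketch(x, y)
```

In this small example the unadjusted index is still positive (2 of 6 pairs agree) while the adjusted value is negative, which is the situation the last sentence of Details describes. Note that the denominator in ari_sketch is zero when both partitions place every element in a single subset, so the sketch returns NaN in that degenerate case.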

Value

a scalar with the adjusted Rand Index (ARI)

See Also

NMI, purity

Examples

set.seed(7)
x <- sample(x = rep(1:3, 4), 12)

set.seed(18)
y <- sample(x = rep(1:3, 4), 12)

adj_RI(x, y)


[Package CommKern version 1.0.1 Index]