reduced_mutual_information {clustAnalytics}    R Documentation

Reduced Mutual Information

Description

Computes Newman's Reduced Mutual Information (RMI), as defined in Newman et al. (2020).

Usage

reduced_mutual_information(
  c1,
  c2,
  base = 2,
  normalized = FALSE,
  method = "approximation2",
  warning = TRUE
)

Arguments

c1, c2

membership vectors

base

Base of the logarithms used in the calculations. Changing it only rescales the final value. Defaults to 2.

normalized

If TRUE, computes the normalized version of the reduced mutual information.

method

Can be "hybrid" (combines Monte Carlo with the analytical formula), "monte_carlo", "approximation1" (appropriate for partitions into many very small clusters), or "approximation2" (the default; for partitions into few larger clusters).

warning

Set to FALSE to suppress the warning.
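
The statement that base only rescales the result can be checked with a minimal base R sketch. It computes plain mutual information, not the reduced variant implemented by this function, and the helper name mutual_information is hypothetical (it is not part of clustAnalytics). The same rescaling argument applies to any quantity built purely from logarithms.

```r
# Hypothetical helper for illustration only: plain (uncorrected) mutual
# information between two membership vectors. This is NOT the RMI.
mutual_information <- function(c1, c2, base = 2) {
  joint <- table(c1, c2) / length(c1)   # joint label distribution
  p1 <- rowSums(joint)                  # marginal distribution of c1
  p2 <- colSums(joint)                  # marginal distribution of c2
  nz <- joint > 0                       # skip empty cells (0 * log 0 = 0)
  sum(joint[nz] * log(joint[nz] / outer(p1, p2)[nz], base = base))
}

c1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
c2 <- c(1, 1, 2, 2, 2, 3, 3, 3, 3)

mi_bits <- mutual_information(c1, c2, base = 2)       # logarithms in base 2
mi_nats <- mutual_information(c1, c2, base = exp(1))  # natural logarithms

# The two values differ only by the constant factor log(2)
all.equal(mi_bits, mi_nats / log(2))
```

Converting between bases is therefore a matter of multiplying by a constant, so the choice of base does not affect comparisons between pairs of partitions.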

Details

The implementation is based on equations 23 (25 for the normalized case) and 29 of Newman et al. (2020). Evaluating the \Gamma functions directly can produce values large enough to overflow in the intermediate steps, so the following term of equation 29:

\frac{1}{2} \log \frac{\Gamma(\mu R) \Gamma(\nu S)} {(\Gamma(\nu)\Gamma(R))^S (\Gamma(\mu)\Gamma(S))^R }

is rewritten as

\frac{1}{2} (\log\Gamma(\mu R) + \log\Gamma(\nu S) - S\log\Gamma(\nu) - S\log\Gamma(R) - R\log\Gamma(\mu) - R\log\Gamma(S))

and the function lgamma is then used instead of gamma.
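
The effect of working in log space can be illustrated with a short base R sketch. The helper names log_gamma_term and naive_gamma_term are hypothetical and are not part of clustAnalytics; the arguments mu, nu, R and S are treated simply as positive numbers standing in for the symbols of equation 29. Each subtracted term comes from expanding the logarithm of the denominator (\Gamma(\nu)\Gamma(R))^S (\Gamma(\mu)\Gamma(S))^R.

```r
# Hypothetical helper: the term of equation 29 evaluated in log space
# with base R's lgamma(), so no intermediate value overflows.
log_gamma_term <- function(mu, nu, R, S) {
  0.5 * (lgamma(mu * R) + lgamma(nu * S) -
           S * lgamma(nu) - S * lgamma(R) -
           R * lgamma(mu) - R * lgamma(S))
}

# Naive version for comparison only: evaluates the Gamma functions
# directly and takes the logarithm at the end.
naive_gamma_term <- function(mu, nu, R, S) {
  numerator <- gamma(mu * R) * gamma(nu * S)
  denominator <- (gamma(nu) * gamma(R))^S * (gamma(mu) * gamma(S))^R
  0.5 * log(numerator / denominator)
}

# For small arguments the two versions agree up to rounding
small_ok <- all.equal(log_gamma_term(2.5, 3.5, 3, 4),
                      naive_gamma_term(2.5, 3.5, 3, 4))

# For larger arguments the naive version is no longer finite,
# while the log-space version still is
large_log <- log_gamma_term(50, 60, 20, 30)
large_naive <- naive_gamma_term(50, 60, 20, 30)
is.finite(large_log)
is.finite(large_naive)
```

In double precision gamma() already overflows for arguments a little above 171, which is why the direct form fails even for moderately sized inputs.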

Value

The value of Newman's RMI (a scalar).

References

Newman MEJ, Cantwell GT, Young J (2020). “Improved mutual information measure for clustering, classification, and community detection.” Phys. Rev. E, 101(4), 042304. doi:10.1103/PhysRevE.101.042304.


[Package clustAnalytics version 0.5.5 Index]