entropy {posterior}    R Documentation
Normalized entropy
Description
Normalized entropy, for measuring dispersion in draws from categorical distributions.
Usage
entropy(x)
## Default S3 method:
entropy(x)
## S3 method for class 'rvar'
entropy(x)
Arguments
x
(multiple options) A vector to be interpreted as draws from a categorical distribution, such as a factor, a numeric vector, or an rvar.
Details
Calculates the normalized Shannon entropy of the draws in x. This value is the entropy of x divided by the maximum entropy of a distribution with n categories, where n is length(unique(x)) for numeric vectors and length(levels(x)) for factors:
-\frac{\sum_{i = 1}^{n} p_i \log(p_i)}{\log(n)}
This scales the output to be between 0 (all probability in one category) and 1 (uniform). This form of normalized entropy is referred to as H_\mathrm{REL} in Wilcox (1967).
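The formula above can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: the helper name normalized_entropy_sketch is hypothetical, and returning 0 when there is only one category and skipping empty categories (treating 0 * log(0) as 0) are assumptions made here for illustration.

```r
# Hypothetical helper (not part of posterior): normalized Shannon entropy
# computed directly from the formula in Details.
normalized_entropy_sketch <- function(x) {
  # n follows the definition in Details: all declared levels for factors,
  # distinct observed values for anything else.
  n <- if (is.factor(x)) nlevels(x) else length(unique(x))

  # Assumption: with a single category the entropy is taken to be 0,
  # which also avoids dividing by log(1) = 0.
  if (n <= 1) return(0)

  # Empirical category proportions; table() keeps unobserved factor levels
  # as zero counts.
  counts <- as.numeric(table(x))
  p <- counts / sum(counts)

  # Assumption: empty categories contribute nothing (0 * log(0) taken as 0).
  p <- p[p > 0]

  -sum(p * log(p)) / log(n)
}

# Draws spread evenly over both levels: the maximum, 1 (up to floating point)
normalized_entropy_sketch(factor(c("a", "a", "b", "b")))

# All draws in one of two declared levels: the minimum, 0
normalized_entropy_sketch(factor(c("a", "a", "a"), levels = c("a", "b")))
```

Because n counts declared levels for factors, an unobserved level still enlarges the denominator log(n). For a numeric vector only the values that actually occur are counted.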
Value
If x is a factor or numeric, returns a length-1 numeric vector with a value between 0 and 1 (inclusive) giving the normalized Shannon entropy of x.
If x is an rvar, returns an array of the same shape as x, where each cell is the normalized Shannon entropy of the draws in the corresponding cell of x.
References
Allen R. Wilcox (1967). Indices of Qualitative Variation (No. ORNL-TM-1919). Oak Ridge National Lab., Tenn.
Examples
set.seed(1234)
levels <- c("a", "b", "c", "d", "e")
# a uniform distribution: high normalized entropy
x <- factor(
  sample(levels, 4000, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.2)),
  levels = levels
)
entropy(x)
# a unimodal distribution: low normalized entropy
y <- factor(
  sample(levels, 4000, replace = TRUE, prob = c(0.95, 0.02, 0.015, 0.01, 0.005)),
  levels = levels
)
entropy(y)
# both together, as an rvar
xy <- c(rvar(x), rvar(y))
xy
entropy(xy)