KL.plugin {entropy} | R Documentation
Plug-In Estimator of the Kullback-Leibler divergence and of the Chi-Squared Divergence
Description
KL.plugin computes the Kullback-Leibler (KL) divergence between two discrete random variables x_1 and x_2. The corresponding probability mass functions are given by freqs1 and freqs2. Note that the expectation is taken with regard to x_1 using freqs1.
chi2.plugin computes the chi-squared divergence between two discrete random variables x_1 and x_2 with freqs1 and freqs2 as corresponding probability mass functions. Note that the denominator contains freqs2.
Usage
KL.plugin(freqs1, freqs2, unit=c("log", "log2", "log10"))
chi2.plugin(freqs1, freqs2, unit=c("log", "log2", "log10"))
Arguments
freqs1 | frequencies (probability mass function) for variable x_1.
freqs2 | frequencies (probability mass function) for variable x_2.
unit | the unit in which entropy is measured. The default is "nats" (natural units). For computing entropy in "bits" set unit="log2".
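As a rough illustration of what the choice of unit means, the base R sketch below computes the KL sum by hand with different logarithm bases. It assumes that the unit argument only changes the base of the logarithm; the variable names (p1, p2, kl_nats, kl_bits, kl_log10) are made up for this example and are not part of the package.

```r
# Illustrative probabilities (same values as in the Examples section)
p1 <- c(1/5, 1/5, 3/5)
p2 <- c(1/10, 4/10, 1/2)

# KL sum with the natural logarithm, i.e. in nats
kl_nats <- sum(p1 * log(p1 / p2))

# Changing the base of the logarithm rescales the result by a constant
kl_bits  <- kl_nats / log(2)    # base 2 ("bits")
kl_log10 <- kl_nats / log(10)   # base 10

# The same values obtained by using log2() and log10() directly
kl_bits_direct  <- sum(p1 * log2(p1 / p2))
kl_log10_direct <- sum(p1 * log10(p1 / p2))

c(nats = kl_nats, bits = kl_bits, log10 = kl_log10)
```

Under this assumption, a value in nats can always be converted afterwards by dividing by log(2) or log(10), so the unit affects only the scale of the result, not the ranking of divergences.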
Details
The Kullback-Leibler divergence from x_1 to x_2 is \sum_k p_1(k) \log (p_1(k)/p_2(k)), where p_1 and p_2 are the probability mass functions of x_1 and x_2, respectively, and k is the index for the classes.
The chi-squared divergence is given by \sum_k (p_1(k)-p_2(k))^2/p_2(k).
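The two sums can be transcribed directly into base R. The sketch below is only an illustration of the formulas, not the package's implementation: kl_manual and chi2_manual are hypothetical helper names, and treating terms with p_1(k) = 0 as zero is an assumed convention.

```r
# Hypothetical helpers (not part of the entropy package): direct
# transcriptions of the two sums, in natural units.
kl_manual <- function(p1, p2) {
  keep <- p1 > 0   # assumed convention: terms with p1 = 0 contribute 0
  sum(p1[keep] * log(p1[keep] / p2[keep]))
}

chi2_manual <- function(p1, p2) {
  sum((p1 - p2)^2 / p2)   # p2 appears in the denominator
}

p1 <- c(1/5, 1/5, 3/5)
p2 <- c(1/10, 4/10, 1/2)

kl   <- kl_manual(p1, p2)    # reduces to 0.6 * log(1.2), about 0.1094
chi2 <- chi2_manual(p1, p2)  # 0.1 + 0.1 + 0.02 = 0.22

c(KL = kl, chi2 = chi2)
```

Both helpers assume that p1 and p2 are already normalized to sum to one and that p2 has no zero entries.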
Note that neither the KL divergence nor the chi-squared divergence is symmetric in x_1 and x_2. The chi-squared divergence can be derived as a quadratic approximation of twice the KL divergence.
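Both remarks can be checked numerically. Writing p_1 = p_2 + d with \sum_k d(k) = 0 and expanding the logarithm to second order gives KL \approx \sum_k d(k)^2/(2 p_2(k)), which is half the chi-squared divergence. The base R sketch below uses hypothetical helpers (kl_manual, chi2_manual) and an arbitrary perturbation size eps chosen for illustration.

```r
# Hypothetical helpers (not part of the entropy package)
kl_manual   <- function(p1, p2) sum(ifelse(p1 > 0, p1 * log(p1 / p2), 0))
chi2_manual <- function(p1, p2) sum((p1 - p2)^2 / p2)

p1 <- c(1/5, 1/5, 3/5)
p2 <- c(1/10, 4/10, 1/2)

# Asymmetry: swapping the arguments changes both divergences
kl_12   <- kl_manual(p1, p2)
kl_21   <- kl_manual(p2, p1)
chi2_12 <- chi2_manual(p1, p2)
chi2_21 <- chi2_manual(p2, p1)
c(kl_12 = kl_12, kl_21 = kl_21, chi2_12 = chi2_12, chi2_21 = chi2_21)

# Quadratic approximation: for p1 close to p2, twice the KL divergence
# is close to the chi-squared divergence
eps <- 0.01                     # arbitrary small perturbation size
q1  <- p2 + eps * c(1, -1, 0)   # perturbed distribution, still sums to 1
twice_kl <- 2 * kl_manual(q1, p2)
chi2_q   <- chi2_manual(q1, p2)
c(twice_kl = twice_kl, chi2 = chi2_q)
```

The agreement is expected to improve as eps shrinks, since the neglected terms are of third order in the perturbation.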
Value
KL.plugin returns the KL divergence.
chi2.plugin returns the chi-squared divergence.
Author(s)
Korbinian Strimmer (https://strimmerlab.github.io).
See Also
KL.Dirichlet, KL.shrink, KL.empirical, mi.plugin, discretize2d.
Examples
# load entropy library
library("entropy")
# probabilities for two random variables
freqs1 = c(1/5, 1/5, 3/5)
freqs2 = c(1/10, 4/10, 1/2)
# KL divergence from x1 to x2
KL.plugin(freqs1, freqs2)
# and corresponding (half) chi-squared divergence
0.5*chi2.plugin(freqs1, freqs2)
## relationship to Pearson chi-squared statistic
# Pearson chi-squared statistic and p-value
n = 30 # sample size (observed counts)
chisq.test(n*freqs1, p = freqs2) # built-in function
# Pearson chi-squared statistic from Pearson divergence
pcs.stat = n*chi2.plugin(freqs1, freqs2) # note factor n
pcs.stat
# and p-value
df = length(freqs1)-1 # degrees of freedom
pcs.pval = 1-pchisq(pcs.stat, df)
pcs.pval