entropy.shrink {entropy} | R Documentation |
Shrinkage Estimators of Entropy, Mutual Information and Related Quantities
Description
freqs.shrink
estimates the bin frequencies from the counts y
using a James-Stein-type shrinkage estimator, where the shrinkage target is the uniform distribution.
entropy.shrink
estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plugging in the shrinkage estimate of the bin frequencies.
KL.shrink
computes a shrinkage estimate of the Kullback-Leibler (KL) divergence from counts y1 and y2.
chi2.shrink
computes a shrinkage version of the chi-squared divergence from counts y1 and y2.
mi.shrink
computes a shrinkage estimate of the mutual information of two random variables.
chi2indep.shrink
computes a shrinkage version of the chi-squared divergence of independence from a table of counts y2d.
Usage
freqs.shrink(y, lambda.freqs, verbose=TRUE)
entropy.shrink(y, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
KL.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
verbose=TRUE)
chi2.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
verbose=TRUE)
mi.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
chi2indep.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
Arguments
y |
vector of counts. |
y1 |
vector of counts. |
y2 |
vector of counts. |
y2d |
matrix of counts. |
unit |
the unit in which entropy is measured.
The default is "nats" (natural units). For
computing entropy in "bits" set unit="log2". |
lambda.freqs |
shrinkage intensity. If not specified (default) it is estimated in a James-Stein-type fashion. |
lambda.freqs1 |
shrinkage intensity for first random variable. If not specified (default) it is estimated in a James-Stein-type fashion. |
lambda.freqs2 |
shrinkage intensity for second random variable. If not specified (default) it is estimated in a James-Stein-type fashion. |
verbose |
report shrinkage intensity. |
Details
The shrinkage estimator is a James-Stein-type estimator. It is essentially
an entropy.Dirichlet estimator, where the pseudocount is
estimated from the data.
For details see Hausser and Strimmer (2009).
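The idea can be sketched in base R without calling the package. This is a minimal illustration, not the package's own code: the helper names shrink_freqs_sketch and entropy_plugin_sketch are hypothetical, and the formula for the estimated intensity is the one given in Hausser and Strimmer (2009), restricted to the interval [0, 1]. The last lines illustrate the equivalence with a Dirichlet-type estimator whose common pseudocount a follows from the estimated intensity.

```r
# Hypothetical helper (not part of the entropy package): James-Stein-type
# shrinkage of bin frequencies towards the uniform distribution.
shrink_freqs_sketch <- function(y, lambda = NULL) {
  n <- sum(y)                               # total number of counts
  p_ml <- y / n                             # empirical (maximum-likelihood) frequencies
  target <- rep(1 / length(y), length(y))   # uniform shrinkage target

  if (is.null(lambda)) {
    # estimated shrinkage intensity, following Hausser and Strimmer (2009)
    denom <- (n - 1) * sum((target - p_ml)^2)
    lambda <- if (denom == 0) 1 else (1 - sum(p_ml^2)) / denom
    lambda <- min(1, max(0, lambda))        # restrict to [0, 1]
  }

  p <- lambda * target + (1 - lambda) * p_ml
  attr(p, "lambda.freqs") <- lambda         # keep the intensity with the result
  p
}

# Hypothetical helper: plug-in Shannon entropy in nats,
# using the convention 0 * log(0) = 0.
entropy_plugin_sketch <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

y <- c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
p_shrink <- shrink_freqs_sketch(y)
H_shrink <- entropy_plugin_sketch(p_shrink)

# Equivalent pseudocount: with K bins, adding a to every count gives the
# same frequencies when lambda = K*a / (n + K*a), provided lambda < 1.
lambda <- attr(p_shrink, "lambda.freqs")
n <- sum(y)
K <- length(y)
a <- n * lambda / (K * (1 - lambda))
p_pseudo <- (y + a) / (n + K * a)
```

Here p_pseudo agrees with p_shrink up to rounding error, which is the sense in which the shrinkage estimator behaves like a pseudocount estimator with a data-driven pseudocount.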
Value
freqs.shrink
returns a shrinkage estimate of the frequencies.
entropy.shrink
returns a shrinkage estimate of the Shannon entropy.
KL.shrink
returns a shrinkage estimate of the KL divergence.
chi2.shrink
returns a shrinkage version of the chi-squared divergence.
mi.shrink
returns a shrinkage estimate of the mutual information.
chi2indep.shrink
returns a shrinkage version of the chi-squared divergence of independence.
In all instances the estimated shrinkage intensity is attached to the returned
value as attribute lambda.freqs.
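As a brief illustration of the attribute, the following sketch reads the estimated intensity from one result and passes it on to a second call. It assumes the entropy package is installed; the variable names H_nats, lambda_hat and H_bits are chosen here for illustration only.

```r
library("entropy")

y <- c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)

# entropy in nats; verbose=FALSE suppresses the report of the intensity
H_nats <- entropy.shrink(y, verbose = FALSE)

# the estimated shrinkage intensity is stored as an attribute of the result
lambda_hat <- attr(H_nats, "lambda.freqs")

# reuse the same intensity, this time measuring entropy in bits
H_bits <- entropy.shrink(y, lambda.freqs = lambda_hat, unit = "log2",
                         verbose = FALSE)
```

Because only the unit differs between the two calls, H_bits should equal H_nats divided by log(2).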
Author(s)
Korbinian Strimmer (https://strimmerlab.github.io).
References
Hausser, J., and K. Strimmer. 2009. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 10: 1469-1484. Available online from https://jmlr.csail.mit.edu/papers/v10/hausser09a.html.
See Also
entropy, entropy.Dirichlet, entropy.plugin, KL.plugin, mi.plugin, discretize.
Examples
# load entropy library
library("entropy")
# a single variable
# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
# shrinkage estimate of frequencies
freqs.shrink(y)
# shrinkage estimate of entropy
entropy.shrink(y)
# example with two variables
# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)
# shrinkage estimate of Kullback-Leibler divergence
KL.shrink(y1, y2)
# half of the shrinkage chi-squared divergence
0.5*chi2.shrink(y1, y2)
## joint distribution example
# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )
# shrinkage estimate of mutual information
mi.shrink(y2d)
# half of the shrinkage chi-squared divergence of independence
0.5*chi2indep.shrink(y2d)