entropy.shrink {entropy}    R Documentation

Shrinkage Estimators of Entropy, Mutual Information and Related Quantities

Description

freqs.shrink estimates the bin frequencies from the counts y using a James-Stein-type shrinkage estimator, where the shrinkage target is the uniform distribution.

entropy.shrink estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plugging in the shrinkage estimate of the bin frequencies.

KL.shrink computes a shrinkage estimate of the Kullback-Leibler (KL) divergence from counts y1 and y2.

chi2.shrink computes a shrinkage version of the chi-squared divergence from counts y1 and y2.

mi.shrink computes a shrinkage estimate of the mutual information of two random variables from a table of counts y2d.

chi2indep.shrink computes a shrinkage version of the chi-squared divergence of independence from a table of counts y2d.
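
To make the relation between these functions concrete, the following sketch computes the two divergences by hand from the shrunken frequencies returned by freqs.shrink. It assumes the usual plug-in definitions, that is, KL divergence as the sum of f1*log(f1/f2) and chi-squared divergence as the sum of (f1-f2)^2/f2. The names KL.byhand and chi2.byhand are illustrative only and are not part of the package.

```r
library("entropy")

# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)

# shrinkage estimates of the bin frequencies (intensity not printed)
f1 = freqs.shrink(y1, verbose=FALSE)
f2 = freqs.shrink(y2, verbose=FALSE)

# plug-in KL divergence in nats (assumed definition, see text above)
KL.byhand = sum(f1*log(f1/f2))

# plug-in chi-squared divergence (assumed definition, see text above)
chi2.byhand = sum((f1 - f2)^2/f2)
```

Half of the chi-squared divergence is a second-order approximation of the KL divergence, so the two values are expected to be close when f1 and f2 are similar.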

Usage

freqs.shrink(y, lambda.freqs, verbose=TRUE)
entropy.shrink(y, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
KL.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
            verbose=TRUE)
chi2.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
            verbose=TRUE)
mi.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
chi2indep.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)

Arguments

y

vector of counts.

y1

vector of counts.

y2

vector of counts.

y2d

matrix of counts.

unit

the unit in which entropy is measured. The default is unit="log", which gives entropy in "nats" (natural units). For computing entropy in "bits" set unit="log2"; unit="log10" uses base-10 logarithms.

lambda.freqs

shrinkage intensity. If not specified (default) it is estimated in a James-Stein-type fashion.

lambda.freqs1

shrinkage intensity for first random variable. If not specified (default) it is estimated in a James-Stein-type fashion.

lambda.freqs2

shrinkage intensity for second random variable. If not specified (default) it is estimated in a James-Stein-type fashion.

verbose

if TRUE (the default), report the shrinkage intensity.

Details

The shrinkage estimator is a James-Stein-type estimator. It is essentially an entropy.Dirichlet estimator in which the pseudocount is estimated from the data.

For details see Hausser and Strimmer (2009).
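
As a rough illustration of the idea, the following base R sketch computes the shrinkage frequencies by hand: the empirical frequencies are pulled toward the uniform target with an intensity estimated from the data, as described in Hausser and Strimmer (2009). The helper shrink_sketch is hypothetical and not part of the package, and the package's own implementation may differ in details such as the handling of edge cases.

```r
# Hypothetical helper (not part of the entropy package): a minimal
# James-Stein-type shrinkage of bin frequencies toward the uniform target.
shrink_sketch = function(y)
{
  n = sum(y)            # total number of counts
  p = length(y)         # number of bins
  u = y/n               # empirical (maximum-likelihood) frequencies
  target = rep(1/p, p)  # uniform shrinkage target

  # estimated intensity: summed unbiased variance of u divided by the
  # squared distance between u and the target, truncated to [0, 1]
  msp = sum((u - target)^2)
  lambda = if (n < 2 || msp == 0) 1 else sum(u*(1 - u))/((n - 1)*msp)
  lambda = max(0, min(1, lambda))

  list(freqs = lambda*target + (1 - lambda)*u, lambda = lambda)
}

y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
s = shrink_sketch(y)

# plug-in entropy (in nats) of the shrunken frequencies
H = -sum(ifelse(s$freqs > 0, s$freqs*log(s$freqs), 0))

# equivalent Dirichlet pseudocount a (only defined for lambda < 1):
# (y + a)/(n + p*a) reproduces the shrunken frequencies
a = s$lambda*sum(y)/((1 - s$lambda)*length(y))
freqs_dirichlet = (y + a)/(sum(y) + length(y)*a)
```

The last two lines show why the estimator can be read as a Dirichlet-type estimator: a shrinkage intensity lambda corresponds to adding the pseudocount a = lambda*n/((1-lambda)*p) to every bin.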

Value

freqs.shrink returns a shrinkage estimate of the frequencies.

entropy.shrink returns a shrinkage estimate of the Shannon entropy.

KL.shrink returns a shrinkage estimate of the KL divergence.

chi2.shrink returns a shrinkage version of the chi-squared divergence.

mi.shrink returns a shrinkage estimate of the mutual information.

chi2indep.shrink returns a shrinkage version of the chi-squared divergence of independence.

In all instances the estimated shrinkage intensity is attached to the returned value as attribute lambda.freqs.
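
A short sketch of how the attached intensity might be used, relying only on the arguments listed under Usage: the estimated value is read back with attr and then supplied explicitly through lambda.freqs. The variable names H.fixed and H.bits are illustrative.

```r
library("entropy")

# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)

# estimate the entropy without printing the shrinkage intensity
H = entropy.shrink(y, verbose=FALSE)

# retrieve the estimated shrinkage intensity from the attribute
lambda = attr(H, "lambda.freqs")

# supplying the same intensity explicitly gives the same estimate
H.fixed = entropy.shrink(y, lambda.freqs=lambda, verbose=FALSE)

# the same estimate expressed in bits rather than nats
H.bits = entropy.shrink(y, unit="log2", verbose=FALSE)
```

Fixing lambda.freqs in this way can be useful when the same amount of shrinkage should be applied across several calls.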

Author(s)

Korbinian Strimmer (http://www.strimmerlab.org).

References

Hausser, J., and K. Strimmer. 2009. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 10: 1469-1484. Available online from https://jmlr.csail.mit.edu/papers/v10/hausser09a.html.

See Also

entropy, entropy.Dirichlet, entropy.plugin, KL.plugin, mi.plugin, discretize.

Examples

# load entropy library 
library("entropy")

# a single variable

# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)  

# shrinkage estimate of frequencies
freqs.shrink(y)

# shrinkage estimate of entropy
entropy.shrink(y)


# example with two variables

# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)

# shrinkage estimate of Kullback-Leibler divergence
KL.shrink(y1, y2)

# half of the shrinkage chi-squared divergence
0.5*chi2.shrink(y1, y2)


# joint distribution example

# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )

# shrinkage estimate of mutual information
mi.shrink(y2d)

# half of the shrinkage chi-squared divergence of independence
0.5*chi2indep.shrink(y2d)



[Package entropy version 1.3.0 Index]