entropy.shrink {entropy}

R Documentation

Shrinkage Estimators of Entropy, Mutual Information and Related Quantities

Description

freqs.shrink estimates the bin frequencies from the counts y using a James-Stein-type shrinkage estimator, where the shrinkage target is the uniform distribution.

entropy.shrink estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plugging the shrinkage estimate of the bin frequencies into the entropy formula.

KL.shrink computes a shrinkage estimate of the Kullback-Leibler (KL) divergence from counts y1 and y2.

chi2.shrink computes a shrinkage version of the chi-squared divergence from counts y1 and y2.

mi.shrink computes a shrinkage estimate of the mutual information of two random variables from a table of counts y2d.

chi2indep.shrink computes a shrinkage version of the chi-squared divergence of independence from a table of counts y2d.

Usage

freqs.shrink(y, lambda.freqs, verbose=TRUE)
entropy.shrink(y, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
KL.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
            verbose=TRUE)
chi2.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
            verbose=TRUE)
mi.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
chi2indep.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)

Arguments

`y`	vector of counts.
`y1`	vector of counts.
`y2`	vector of counts.
`y2d`	matrix of counts.
`unit`	the unit in which entropy is measured. The default, `unit="log"`, gives entropy in "nats" (natural units). For computing entropy in "bits" set `unit="log2"`.
`lambda.freqs`	shrinkage intensity. If not specified (default) it is estimated in a James-Stein-type fashion.
`lambda.freqs1`	shrinkage intensity for first random variable. If not specified (default) it is estimated in a James-Stein-type fashion.
`lambda.freqs2`	shrinkage intensity for second random variable. If not specified (default) it is estimated in a James-Stein-type fashion.
`verbose`	if `TRUE` (default), report the shrinkage intensity.
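
The three choices of `unit` differ only in the base of the logarithm, so the resulting values are related by a constant factor. The following base R sketch illustrates this with a plain plug-in entropy computed from empirical frequencies. It does not call the package, and the variable names are chosen for illustration only.

```r
# illustrative counts (no empty bins, so log() is always finite)
y = c(4, 2, 3, 1, 10, 4)
f = y / sum(y)                  # empirical frequencies

H.nats  = -sum(f * log(f))      # natural logarithm: nats
H.bits  = -sum(f * log2(f))     # base-2 logarithm: bits
H.log10 = -sum(f * log10(f))    # base-10 logarithm

# the units differ only by a constant factor
all.equal(H.bits,  H.nats / log(2))
all.equal(H.log10, H.nats / log(10))
```

A value reported in nats can therefore be converted to bits afterwards by dividing by `log(2)`.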

Details

The shrinkage estimator is a James-Stein-type estimator. It is essentially an entropy.Dirichlet estimator, where the pseudocount is estimated from the data.

For details see Hausser and Strimmer (2009).
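
The idea can be sketched in a few lines of base R. This is a minimal illustration and not the package's own code: the helper name `shrink.freqs.sketch` is hypothetical, and the formula for the intensity is assumed to follow the estimator described in Hausser and Strimmer (2009), i.e. the estimated variance of the empirical frequencies divided by their squared distance to the uniform target, truncated to [0, 1].

```r
# Hypothetical helper (not part of the entropy package):
# shrink empirical bin frequencies towards the uniform distribution.
shrink.freqs.sketch = function(y)
{
  n = sum(y)                   # total number of counts (assumed > 1)
  p = length(y)                # number of bins
  u = y / n                    # empirical (maximum likelihood) frequencies
  target = rep(1 / p, p)       # uniform shrinkage target

  # intensity: unbiased variance estimate of u over squared distance to target
  msp = sum((u - target)^2)
  lambda = if (msp == 0) 1 else sum(u * (1 - u)) / ((n - 1) * msp)
  lambda = max(0, min(1, lambda))    # truncate to [0, 1]

  list(freqs = lambda * target + (1 - lambda) * u, lambda = lambda)
}

y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
fit = shrink.freqs.sketch(y)
fit$lambda                     # about 0.766 for these counts

# plug-in Shannon entropy (in nats) of the shrunken frequencies
f = fit$freqs
H = -sum(ifelse(f > 0, f * log(f), 0))

# the same frequencies written as a Dirichlet-type estimate with pseudocount a
n = sum(y); p = length(y)
a = n * fit$lambda / (p * (1 - fit$lambda))
f.dirichlet = (y + a) / (n + p * a)
all.equal(f, f.dirichlet)
```

The last lines make the link to the pseudocount view explicit: for an intensity strictly below 1, shrinking towards the uniform distribution with intensity lambda is algebraically the same as adding the pseudocount a = n lambda / (p (1 - lambda)) to every bin.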

Value

freqs.shrink returns a shrinkage estimate of the frequencies.

entropy.shrink returns a shrinkage estimate of the Shannon entropy.

KL.shrink returns a shrinkage estimate of the KL divergence.

chi2.shrink returns a shrinkage version of the chi-squared divergence.

mi.shrink returns a shrinkage estimate of the mutual information.

chi2indep.shrink returns a shrinkage version of the chi-squared divergence of independence.

In all instances the estimated shrinkage intensity is attached to the returned value as attribute lambda.freqs.
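
A short sketch of how the attached intensity might be read back, and of how a fixed intensity can be supplied instead of an estimated one. It relies only on the arguments listed above and on base R's `attr()`. The interpretation of the boundary values (0 as no shrinkage, 1 as full shrinkage to the uniform target) is an assumption based on the description of the estimator.

```r
library("entropy")

y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)

# estimate the entropy and read the estimated shrinkage intensity
H = entropy.shrink(y, verbose=FALSE)
lambda = attr(H, "lambda.freqs")

# as.numeric() drops the attribute when only the value is needed
H.value = as.numeric(H)

# supplying lambda.freqs skips the estimation step
H.none = entropy.shrink(y, lambda.freqs=0, verbose=FALSE)   # no shrinkage
H.full = entropy.shrink(y, lambda.freqs=1, verbose=FALSE)   # full shrinkage
```

With full shrinkage every bin receives the same frequency, so the entropy is expected to equal the logarithm of the number of bins.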

Author(s)

Korbinian Strimmer (https://strimmerlab.github.io).

References

Hausser, J., and K. Strimmer. 2009. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 10: 1469-1484. Available online from https://jmlr.csail.mit.edu/papers/v10/hausser09a.html.

Examples

# load the entropy package
library("entropy")

# a single variable

# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)  

# shrinkage estimate of frequencies
freqs.shrink(y)

# shrinkage estimate of entropy
entropy.shrink(y)


# example with two variables

# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)

# shrinkage estimate of Kullback-Leibler divergence
KL.shrink(y1, y2)

# half of the shrinkage chi-squared divergence
0.5*chi2.shrink(y1, y2)


# example with a joint distribution

# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )

# shrinkage estimate of mutual information
mi.shrink(y2d)

# half of the shrinkage chi-squared divergence of independence
0.5*chi2indep.shrink(y2d)

[Package entropy version 1.3.1 Index]