entropy.shrink {entropy} R Documentation

## Shrinkage Estimators of Entropy, Mutual Information and Related Quantities

### Description

freq.shrink estimates the bin frequencies from the counts y using a James-Stein-type shrinkage estimator, where the shrinkage target is the uniform distribution.

entropy.shrink estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plug-in of shrinkage estimate of the bin frequencies.

KL.shrink computes a shrinkage estimate of the Kullback-Leibler (KL) divergence from counts y1 and y2.

chi2.shrink computes a shrinkage version of the chi-squared divergence from counts y1 and y2.

mi.shrink estimates a shrinkage estimate of mutual information of two random variables.

chi2indep.shrink computes a shrinkage version of the chi-squared divergence of independence from a table of counts y2d.

### Usage

freqs.shrink(y, lambda.freqs, verbose=TRUE)
entropy.shrink(y, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
KL.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
verbose=TRUE)
chi2.shrink(y1, y2, lambda.freqs1, lambda.freqs2, unit=c("log", "log2", "log10"),
verbose=TRUE)
mi.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)
chi2indep.shrink(y2d, lambda.freqs, unit=c("log", "log2", "log10"), verbose=TRUE)


### Arguments

 y vector of counts. y1 vector of counts. y2 vector of counts. y2d matrix of counts. unit the unit in which entropy is measured. The default is "nats" (natural units). For computing entropy in "bits" set unit="log2". lambda.freqs shrinkage intensity. If not specified (default) it is estimated in a James-Stein-type fashion. lambda.freqs1 shrinkage intensity for first random variable. If not specified (default) it is estimated in a James-Stein-type fashion. lambda.freqs2 shrinkage intensity for second random variable. If not specified (default) it is estimated in a James-Stein-type fashion. verbose report shrinkage intensity.

### Details

The shrinkage estimator is a James-Stein-type estimator. It is essentially a entropy.Dirichlet estimator, where the pseudocount is estimated from the data.

For details see Hausser and Strimmer (2009).

### Value

freqs.shrink returns a shrinkage estimate of the frequencies.

entropy.shrink returns a shrinkage estimate of the Shannon entropy.

KL.shrink returns a shrinkage estimate of the KL divergence.

chi2.shrink returns a shrinkage version of the chi-squared divergence.

mi.shrink returns a shrinkage estimate of the mutual information.

chi2indep.shrink returns a shrinkage version of the chi-squared divergence of independence.

In all instances the estimated shrinkage intensity is attached to the returned value as attribute lambda.freqs.

### Author(s)

Korbinian Strimmer (https://strimmerlab.github.io).

### References

Hausser, J., and K. Strimmer. 2009. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 10: 1469-1484. Available online from https://jmlr.csail.mit.edu/papers/v10/hausser09a.html.

entropy, entropy.Dirichlet, entropy.plugin, KL.plugin, mi.plugin, discretize.

### Examples

# load entropy library
library("entropy")

# a single variable

# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)

# shrinkage estimate of frequencies
freqs.shrink(y)

# shrinkage estimate of entropy
entropy.shrink(y)

# example with two variables

# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)

# shrinkage estimate of Kullback-Leibler divergence
KL.shrink(y1, y2)

# half of the shrinkage chi-squared divergence
0.5*chi2.shrink(y1, y2)

## joint distribution example

# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )

# shrinkage estimate of mutual information
mi.shrink(y2d)

# half of the shrinkage chi-squared divergence of independence
0.5*chi2indep.shrink(y2d)



[Package entropy version 1.3.1 Index]