KLentropy {IndepTest}    R Documentation

KLentropy

Description

Calculates the (weighted) Kozachenko–Leonenko entropy estimator studied in Berrett, Samworth and Yuan (2018), which is based on the k-nearest neighbour distances of the sample.
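
As a rough illustration of what the unweighted estimator computes for a single k, the following base R sketch may help. It assumes the usual form of the Kozachenko-Leonenko estimator, namely the average over sample points of log(rho_k^d * V_d * (n - 1)) - digamma(k), where rho_k is the distance from a point to its k-th nearest neighbour and V_d is the volume of the d-dimensional unit ball. The function name kl_entropy_sketch is hypothetical and this is not the package's implementation; it uses a full O(n^2) distance matrix, so it is only suitable for small samples.

```r
# Hypothetical helper: unweighted Kozachenko-Leonenko estimate for one k.
# A sketch under the assumptions stated above, not the IndepTest code.
kl_entropy_sketch <- function(x, k) {
  x <- as.matrix(x)                      # a vector becomes an n x 1 matrix
  n <- nrow(x)
  d <- ncol(x)
  dmat <- as.matrix(dist(x))             # n x n Euclidean distance matrix
  # Sort each row; position 1 is the point itself (distance 0),
  # so position k + 1 is the k-th nearest neighbour distance.
  rho_k <- apply(dmat, 1, function(r) sort(r)[k + 1])
  # log volume of the d-dimensional unit ball: pi^(d/2) / gamma(1 + d/2)
  log_vd <- (d / 2) * log(pi) - lgamma(1 + d / 2)
  mean(d * log(rho_k) + log_vd + log(n - 1) - digamma(k))
}

set.seed(1)
x <- rnorm(500)
est <- kl_entropy_sketch(x, 5)
truth <- 0.5 * log(2 * pi * exp(1))      # entropy of N(0, 1)
print(c(estimate = est, truth = truth))
```

For standard normal data the estimate should land near the true entropy, with the gap shrinking as n grows.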

Usage

KLentropy(x, k, weights = FALSE, stderror = FALSE)

Arguments

x

The n × d data matrix.

k

The tuning parameter that gives the maximum number of neighbours that will be considered by the estimator.

weights

Specifies whether a weighted or unweighted estimator is used. The default (weights=FALSE) gives the unweighted estimator. If weights=TRUE, the weights are calculated by L2OptW; alternatively, the user may supply their own weight vector, which should have one entry for each number of neighbours from 1 to k.

stderror

Specifies whether an estimate of the standard error of the weighted estimate is calculated. The calculation is done using an unweighted version of the variance estimator described on page 7 of Berrett, Samworth and Yuan (2018).

Value

A list. The first element is the vector of unweighted estimates, one for each number of neighbours from 1 up to the user-specified k. The second element is the weighted estimate, obtained by taking the inner product between the first element of the list and the weight vector. If stderror=TRUE, the third element is an estimate of the standard error of the weighted estimate.
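
The inner product that produces the weighted estimate can be illustrated with a few lines of base R. The numbers below are made-up placeholders standing in for the unweighted estimates at k = 1, ..., 5, and the weight vector is hypothetical; neither comes from KLentropy or L2OptW.

```r
# Made-up placeholder values for the unweighted estimates at k = 1, ..., 5.
unweighted <- c(1.40, 1.41, 1.42, 1.43, 1.44)

# Hypothetical weight vector with one entry per k, summing to 1.
w <- c(0, 0, 0.2, 0.3, 0.5)
stopifnot(isTRUE(all.equal(sum(w), 1)))

# Weighted estimate: inner product of the weights and the unweighted estimates.
weighted <- sum(w * unweighted)
print(weighted)
```

The same value can be obtained with drop(crossprod(w, unweighted)).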

References

Berrett, T. B., Samworth, R. J. and Yuan, M. (2018). “Efficient multivariate entropy estimation via k-nearest neighbour distances.” Annals of Statistics, to appear.

Examples

library(IndepTest)

n <- 1000; x <- rnorm(n)
KLentropy(x, 30, stderror = TRUE)                 # The true value is 0.5*log(2*pi*exp(1)) = 1.42
n <- 5000; x <- matrix(rnorm(4 * n), ncol = 4)    # The true value is 2*log(2*pi*exp(1)) = 5.68
KLentropy(x, 30, weights = FALSE)                 # Unweighted estimator
KLentropy(x, 30, weights = TRUE)                  # Weights chosen by L2OptW
w <- runif(30); w <- w / sum(w)
KLentropy(x, 30, weights = w)                     # User-specified weights
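
The true values quoted in the comments above follow from the closed-form entropy of a d-dimensional standard normal distribution, (d/2) * log(2 * pi * e). The short check below reproduces them; the helper name gaussian_entropy is hypothetical and is not part of IndepTest.

```r
# Hypothetical helper: differential entropy of a d-dimensional
# standard normal distribution, (d / 2) * log(2 * pi * e).
gaussian_entropy <- function(d) (d / 2) * log(2 * pi * exp(1))

h1 <- gaussian_entropy(1)   # univariate example
h4 <- gaussian_entropy(4)   # four-dimensional example
print(round(c(d1 = h1, d4 = h4), 2))
```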


[Package IndepTest version 0.2.0 Index]