R: Kullback and entropy estimation from MCMC simulation output -...

EntropyMCMC {EntropyMCMC}

R Documentation

Kullback and entropy estimation from MCMC simulation output - single and multicore versions

Description

These functions return estimates of the entropy of the density p^t of an MCMC algorithm at time t, E_{p^t}[\log(p^t)], and of the Kullback divergence between p^t and the target density, for t=1 up to the number of iterations that have been simulated. The MCMC simulations must be computed beforehand or externally, and passed as a "plMCMC" object in the first argument (see details). The target may be known only up to a multiplicative constant (see details).
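As a reminder of how the two estimated quantities relate, the standard identity for the Kullback divergence can be written out as below. The symbols \varphi and C are notation introduced here for illustration only (an unnormalized version of the target f and its unknown constant); they do not appear in the package.

```latex
\[
K(p^t, f) \;=\; \int p^t(x)\,\log\frac{p^t(x)}{f(x)}\,dx
\;=\; E_{p^t}[\log(p^t)] \;-\; E_{p^t}[\log(f)].
\]
% If only \varphi = C f is available, with C > 0 unknown:
\[
E_{p^t}[\log(p^t)] \;-\; E_{p^t}[\log(\varphi)] \;=\; K(p^t, f) \;-\; \log C .
\]
```

Under this reading, an unnormalized target only shifts the whole curve by the constant -\log C, so the sequence stabilizes at that constant rather than at zero, and comparisons between algorithms run on the same target are unaffected.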

EntropyMCMC.mc is a parallel computing version that uses the parallel package to split the task between the available (virtual) cores on the computer. This version, which uses a socket cluster, is not available on Windows computers.

Usage

EntropyMCMC(plmc1, method = "A.Nearest.Neighbor", k=1, trim = 0.02, eps=0, 
        all.f = TRUE, verb = FALSE, EntVect = FALSE,
        uselogtarget = FALSE, logtarget = NULL)

EntropyMCMC.mc(plmc1, method = "A.Nearest.Neighbor", k = 1, trim = 0.02, eps=0,
        all.f = TRUE, verb = FALSE, EntVect = FALSE, nbcores=detectCores(), 
        uselogtarget = FALSE, logtarget = NULL)

Arguments

`plmc1`	An object of class `plMCMC` (for parallel MCMC), like the output of `MCMCcopies`, which contains all the simulations plus the definition and parameters of the target `f`.
`method`	The method for estimating the entropy `E_{p^t}[\log(p^t)]`. The methods currently implemented are: `"NearestNeighbor"` as in Kozachenko and Leonenko (1987); `"k.NearestNeighbor"` as in Leonenko et al. (2005); `"A.Nearest.Neighbor"` (the default), which is the same as `"k.NearestNeighbor"` but uses the RANN package for fast (approximate) computation of nearest neighbors; and `"Gyorfi.trim"`, the subsampling method defined in Györfi and van der Meulen (1989), with an additional tuning parameter `trim` for trimming the data (see Chauveau and Vandekerkhove (2011)).
`k`	The k-nearest neighbor index, the default is `k=1`.
`trim`	Parameter controlling the percentage of smallest data from one subsample that is removed, only for `method = "Gyorfi.trim"`.
`eps`	A parameter controlling precision in the `"A.Nearest.Neighbor"` method; the default (`eps=0`) means no approximation, see the RANN package.
`all.f`	If `TRUE` (the default), relative entropy is computed over the whole sample. Should be removed in the next version.
`verb`	Verbose mode
`EntVect`	If `FALSE` (the default), the entropy is computed only on the kth-nearest neighbor. If `TRUE`, the entropy is computed for all j-NN's for `j=1` to `k` (the latter being mostly for testing purposes).
`nbcores`	Number of required (virtual) cores, defaults to all as returned by `detectCores()`.
`uselogtarget`	Set to `FALSE` by default; useful in cases where `log(f(\theta))` returns `-Inf` values in Kullback computations because `f(\theta)` itself returns values that are too small for some `\theta` far from modal regions. In these cases, using a function that computes the logarithm of the target directly can remove the infinite values.
`logtarget`	The function defining `log(f(theta))`, `NULL` by default, required if `uselogtarget` equals `TRUE`. This option and `uselogtarget` are currently implemented only for the `"A.Nearest.Neighbor"` method, and for the default `EntVect = FALSE` option.
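The underflow that `uselogtarget` is meant to work around can be reproduced in base R. The sketch below is illustrative only: `f_sketch`, `logf_sketch` and the `s2` parameter are hypothetical names, not part of the package, and the `(x, param)` signature is assumed by analogy with `target_norm(x,param)`.

```r
# Hypothetical unnormalized isotropic Gaussian target and its logarithm
# (names and parameter list are assumptions made for this illustration).
f_sketch    <- function(x, param) exp(-sum((x - param$mean)^2) / (2 * param$s2))
logf_sketch <- function(x, param) -sum((x - param$mean)^2) / (2 * param$s2)

param     <- list(mean = c(0, 0), s2 = 1)
theta_far <- c(40, 40)               # a point far from the modal region

naive  <- log(f_sketch(theta_far, param))  # exp(-1600) underflows to 0, so this is -Inf
direct <- logf_sketch(theta_far, param)    # computed on the log scale: -1600

print(c(naive = naive, direct = direct))
```

A function written on the log scale like `logf_sketch` is the kind of object one would presumably pass as `logtarget` together with `uselogtarget = TRUE`.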

Details

Methods based on Nearest Neighbors (NN) should be preferred since they require fewer tuning parameters. Some options, such as uselogtarget, are in a testing phase and are not implemented in all the available methods (see Arguments).
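To give an idea of what a nearest-neighbor entropy estimate involves, here is a minimal base-R sketch of the classical k-NN (Kozachenko-Leonenko type) estimator applied to a single iid sample. It is not the package's implementation (which, for the default method, relies on RANN); `knn_entropy_sketch` is a hypothetical name, and the brute-force distance matrix is only practical for small samples.

```r
# Sketch of a k-NN estimate of E_p[log p] from an iid sample (rows of x).
# Brute-force O(N^2) distances; an assumption-laden illustration only.
knn_entropy_sketch <- function(x, k = 1) {
  x <- as.matrix(x)
  N <- nrow(x); d <- ncol(x)
  D <- as.matrix(dist(x))                      # pairwise Euclidean distances
  diag(D) <- Inf                               # a point is not its own neighbor
  rho <- apply(D, 1, function(r) sort(r)[k])   # distance to the k-th nearest neighbor
  logVd <- (d / 2) * log(pi) - lgamma(d / 2 + 1)  # log-volume of the unit ball in R^d
  H <- log(N - 1) - digamma(k) + logVd + d * mean(log(rho))  # Shannon entropy estimate
  -H                                           # sign convention of this page: E_p[log p]
}

set.seed(1)
x     <- matrix(rnorm(1000 * 2), ncol = 2)     # sample from a standard bivariate normal
est   <- knn_entropy_sketch(x, k = 1)
truth <- -(1 + log(2 * pi))                    # exact E_f[log f] for this density
print(c(estimate = est, truth = truth))
```

The only tuning parameter here is `k`, which illustrates why the NN-based methods need less tuning than the trimmed subsampling approach.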

Value

An object of class KbMCMC (for Kullback MCMC), containing:

`Kullback`	A vector of estimated divergences `K(p^t,f)`, for `t=1` up to the number of iterations that have been simulated. This is the convergence/comparison criterion.
`Entp`	A vector of estimated entropies `E_{p^t}[\log(p^t)]`, for `t=1` up to the number of iterations that have been simulated.
`nmc`	The number of iid copies of each single chain.
`dim`	The state space dimension of the MCMC algorithm.
`algo`	The name of the MCMC algorithm that has been used to simulate the copies of chains, see `MCMCcopies`.
`target`	The target density for which the MCMC algorithm is defined; usually given only up to a multiplicative constant for MCMC in Bayesian models. `target` must be a function such as the multidimensional Gaussian `target_norm(x,param)`, with argument and parameters passed as in the example below.
`method`	The `method` input parameter (see above).
`f_param`	A list holding all the necessary target parameters, consistent with the target definition.
`q_param`	A list holding all the necessary parameters for the proposal density of the MCMC algorithm that has been used.

Note

The method "Resubst" is implemented for testing, without theoretical guarantee of convergence.

Author(s)

Didier Chauveau, Houssam Alrachid.

References

Chauveau, D. and Vandekerkhove, P. (2013), Smoothness of Metropolis-Hastings algorithm and application to entropy estimation. ESAIM: Probability and Statistics, 17, 419–431. DOI: http://dx.doi.org/10.1051/ps/2012004
Chauveau, D. and Vandekerkhove, P. (2014), Simulation Based Nearest Neighbor Entropy Estimation for (Adaptive) MCMC Evaluation. In JSM Proceedings, Statistical Computing Section. Alexandria, VA: American Statistical Association. 2816–2827.
Chauveau, D. and Vandekerkhove, P. (2014), The Nearest Neighbor entropy estimate: an adequate tool for adaptive MCMC evaluation. Preprint HAL http://hal.archives-ouvertes.fr/hal-01068081.

Examples

## Toy example using the bivariate Gaussian target
## with default parameter values, see target_norm_param
n = 150; nmc = 50; d=2 # bivariate example
varq=0.1 # variance of the proposal (chosen too small)
q_param=list(mean=rep(0,d),v=varq*diag(d))
## initial distribution, located in (2,2), "far" from target center (0,0)
Ptheta0 <- DrawInit(nmc, d, initpdf = "rnorm", mean = 2, sd = 1) 
# simulation of the nmc iid chains, singlecore 
s1 <- MCMCcopies(RWHM, n, nmc, Ptheta0, target_norm,
                 target_norm_param, q_param, verb = FALSE)
summary(s1) # method for "plMCMC" object
e1 <- EntropyMCMC(s1) # computes Entropy and Kullback divergence estimates
par(mfrow=c(1,2))
plot(e1) # plot method for "KbMCMC" object, convergence after about 80 iterations
plot(e1, Kullback = FALSE) # Plot Entropy estimates over time
abline(normEntropy(target_norm_param), 0, col=8, lty=2) # true E_f[log(f)]

[Package EntropyMCMC version 1.0.4 Index]