EntropyMCMC {EntropyMCMC}    R Documentation

Kullback and entropy estimation from MCMC simulation output - single and multicore versions


Description

These functions return estimates of the entropy of the density p^t of an MCMC algorithm at time t, E_{p^t}[\log(p^t)], and of the Kullback divergence between p^t and the target density, for t=1 up to the number of simulated iterations. The MCMC simulations must be run beforehand or externally, and passed as a "plMCMC" object in the first argument (see Details). The target may be known only up to a multiplicative constant (see Details).
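
As a reminder of how the two estimated quantities relate, the following is a standard identity, stated here for reference rather than taken from the package source. The symbols \tilde f (the unnormalized target) and c (the unknown multiplicative constant) are notation introduced only for this sketch.

```latex
\[
K(p^t, f) \;=\; \int p^t \log\frac{p^t}{f}
          \;=\; E_{p^t}[\log(p^t)] \;-\; E_{p^t}[\log(f)],
\qquad
f = c\,\tilde f \;\Longrightarrow\;
E_{p^t}[\log(p^t)] - E_{p^t}[\log(\tilde f)] \;=\; K(p^t, f) + \log c .
\]
```

Under this identity, replacing f by \tilde f shifts the whole curve t -> K(p^t, f) by the constant \log c, so the iteration at which the curve stabilizes and comparisons between algorithms on the same target are unaffected.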

EntropyMCMC.mc is a parallel version that uses the parallel package to split the task among the available (virtual) cores of the computer. This version, which uses a socket cluster, is not available on Windows.


Usage

EntropyMCMC(plmc1, method = "A.Nearest.Neighbor", k = 1, trim = 0.02, eps = 0,
        all.f = TRUE, verb = FALSE, EntVect = FALSE,
        uselogtarget = FALSE, logtarget = NULL)

EntropyMCMC.mc(plmc1, method = "A.Nearest.Neighbor", k = 1, trim = 0.02, eps = 0,
        all.f = TRUE, verb = FALSE, EntVect = FALSE, nbcores = detectCores(),
        uselogtarget = FALSE, logtarget = NULL)



Arguments

plmc1: An object of class plMCMC (for parallel MCMC), such as the output of MCMCcopies, which contains all the simulations plus the definition and parameters of the target f.


method: The method for estimating the entropy E_{p^t}[\log(p^t)]. The methods currently implemented are: "NearestNeighbor" as in Kozachenko and Leonenko (1987); "k.NearestNeighbor" as in Leonenko et al. (2005); "A.Nearest.Neighbor" (the default), which is the same as "k.NearestNeighbor" but uses the RANN package for fast (approximate) computation of nearest neighbors; and "Gyorfi.trim", the subsampling method defined in Gyorfi and van der Meulen (1989), with an additional tuning parameter trim for trimming the data (see Chauveau and Vandekerkhove (2011)).


k: The k-nearest neighbor index; the default is k=1.


trim: Parameter controlling the percentage of the smallest data values in one subsample that are removed; used only for method = "Gyorfi.trim".


eps: A parameter controlling precision in the "A.Nearest.Neighbor" method; the default eps=0 means no approximation, see the RANN package.


all.f: If TRUE (the default), the relative entropy is computed over the whole sample. This argument is expected to be removed in the next version.


verb: Verbose mode.


EntVect: If FALSE (the default), the entropy is computed only on the kth-nearest neighbor. If TRUE, the entropy is computed for all j-NN's for j=1 to k (the latter being mostly for testing purposes).


nbcores: Number of required (virtual) cores; defaults to all cores, as returned by detectCores().


uselogtarget: Set to FALSE by default; useful in some cases where log(f(\theta)) returns -Inf values in Kullback computations because f(\theta) itself returns values that are too small for some \theta far from modal regions. In such cases, using a function that computes the logarithm of the target directly can avoid the infinite values.


logtarget: The function defining log(f(\theta)), NULL by default; required if uselogtarget is TRUE. This option and uselogtarget are currently implemented only for the "A.Nearest.Neighbor" method, and only for the default EntVect = FALSE option.
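
The underflow that uselogtarget is meant to work around can be reproduced in base R. The sketch below is illustrative only: target_iso, logtarget_iso and the parameter list are hypothetical names invented for this example, and the (x, param) signature is assumed by analogy with target_norm(x, param), not confirmed by this page.

```r
## Hypothetical isotropic Gaussian target and its log version
## (names and parameter list are assumptions made for illustration).
target_iso <- function(x, param) {
  prod(dnorm(x, mean = param$mean, sd = param$sd))
}
logtarget_iso <- function(x, param) {
  sum(dnorm(x, mean = param$mean, sd = param$sd, log = TRUE))
}

param <- list(mean = 0, sd = 1)
theta <- c(40, 40)            # a point far from the modal region

naive  <- log(target_iso(theta, param))   # density underflows to 0, so the log is -Inf
stable <- logtarget_iso(theta, param)     # computed on the log scale, stays finite
print(c(naive = naive, stable = stable))
```

A function written like logtarget_iso is the kind of object one would expect to pass as logtarget together with uselogtarget = TRUE.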


Details

Methods based on Nearest Neighbors (NN) should be preferred since they require fewer tuning parameters. Some options, such as uselogtarget, are still in a testing phase and are not implemented in all the available methods (see Arguments).
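
To make the nearest-neighbor idea concrete, here is a minimal base-R sketch of one common form of the k-NN entropy estimator. It is not the package's implementation (which, per the Arguments section, relies on RANN for the default method); knn_entropy is a hypothetical helper, and the exact bias-correction constants used by the package may differ. The function returns an estimate of E_p[\log(p)], following the sign convention of this page.

```r
## Sketch of a k-nearest-neighbor estimate of E_p[log p] (hypothetical helper).
## x: N x d matrix whose rows are iid draws from the density p.
knn_entropy <- function(x, k = 1) {
  x <- as.matrix(x)
  N <- nrow(x)
  d <- ncol(x)
  D <- as.matrix(dist(x))          # all pairwise Euclidean distances
  diag(D) <- Inf                   # a point is not its own neighbor
  rho <- apply(D, 1, function(r) sort(r)[k])   # distance to the k-th NN
  log_unit_ball <- (d / 2) * log(pi) - lgamma(d / 2 + 1)
  ## differential entropy estimate, H = -E_p[log p]
  H <- d * mean(log(rho)) + log_unit_ball + digamma(N) - digamma(k)
  -H
}

set.seed(1)
d <- 2
x <- matrix(rnorm(1000 * d), ncol = d)     # sample from N(0, I_2)
truth <- -(d / 2) * (1 + log(2 * pi))      # exact E_f[log f] for N(0, I_d)
est1 <- knn_entropy(x, k = 1)
est5 <- knn_entropy(x, k = 5)
print(c(truth = truth, k1 = est1, k5 = est5))
```

The only tuning parameter here is k, which illustrates why the NN-based methods need less tuning than a trimmed subsampling approach.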


Value

An object of class KbMCMC (for Kullback MCMC), containing:


A vector of estimated divergences K(p^t,f), for t=1 up to the number of iterations that have been simulated. This is the convergence/comparison criterion.


A vector of estimated entropies E_{p^t}[\log(p^t)], for t=1 up to the number of iterations that have been simulated.


The number of iid copies of each single chain.


The state space dimension of the MCMC algorithm.


The name of the MCMC algorithm that was used to simulate the copies of chains, see MCMCcopies.


The target density for which the MCMC algorithm is defined; usually given only up to a multiplicative constant for MCMC in Bayesian models. target must be a function such as the multidimensional Gaussian target_norm(x,param), with argument and parameters passed as in the example below.


The method input parameter (see above).


A list holding all the necessary target parameters, consistent with the target definition.


A list holding all the necessary parameters for the proposal density of the MCMC algorithm that was used.


Note

The method "Resubst" is implemented for testing purposes only, without theoretical guarantee of convergence.


Author(s)

Didier Chauveau, Houssam Alrachid.


See Also

MCMCcopies and MCMCcopies.mc for iid MCMC simulations (single core and multicore), EntropyParallel and EntropyParallel.cl for simultaneous simulation and entropy estimation (single core and multicore).


Examples

## Toy example using the bivariate Gaussian target
## with default parameter values, see target_norm_param
n <- 150; nmc <- 50; d <- 2 # bivariate example
varq <- 0.1 # variance of the proposal (chosen too small)
## proposal parameters built from varq (list structure assumed for RWHM)
q_param <- list(mean = rep(0, d), v = varq * diag(d))
## initial distribution, located at (2,2), "far" from the target center (0,0)
Ptheta0 <- DrawInit(nmc, d, initpdf = "rnorm", mean = 2, sd = 1)
## simulation of the nmc iid chains, single core
s1 <- MCMCcopies(RWHM, n, nmc, Ptheta0, target_norm,
                 target_norm_param, q_param, verb = FALSE)
summary(s1) # method for "plMCMC" object
e1 <- EntropyMCMC(s1) # computes entropy and Kullback divergence estimates
plot(e1) # default plot method for the "KbMCMC" object, convergence after about 80 iterations
plot(e1, Kullback = FALSE) # plot entropy estimates over time
abline(normEntropy(target_norm_param), 0, col = 8, lty = 2) # true E_f[log(f)]
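
For readers who want to see the principle without installing the package, the following base-R sketch mimics the workflow above: it runs nmc iid random-walk Metropolis chains on an unnormalized bivariate Gaussian target and, at each iteration t, estimates the Kullback criterion from the nmc states with a 1-NN entropy estimate. Everything here (log_f, nn_Elogp, the proposal variance, the number of chains) is a hypothetical stand-in chosen for illustration, not the package's code or defaults.

```r
set.seed(123)
n <- 60        # number of iterations
nmc <- 400     # number of iid chains
d <- 2         # state space dimension

## unnormalized log-density of N(0, I_d); the constant (2*pi)^(-d/2) is dropped
log_f <- function(x) -0.5 * sum(x^2)

## 1-NN estimate of E_p[log p] from the rows of x (hypothetical helper)
nn_Elogp <- function(x) {
  N <- nrow(x); d <- ncol(x)
  D <- as.matrix(dist(x)); diag(D) <- Inf
  rho <- apply(D, 1, min)                       # nearest-neighbor distances
  H <- d * mean(log(rho)) + (d / 2) * log(pi) - lgamma(d / 2 + 1) +
    digamma(N) - digamma(1)
  -H
}

## initial states drawn around (2,2), away from the target center (0,0)
theta <- matrix(rnorm(nmc * d, mean = 2, sd = 1), ncol = d)
K <- numeric(n)
for (t in seq_len(n)) {
  ## one random-walk Metropolis step applied to every chain
  prop <- theta + matrix(rnorm(nmc * d, sd = sqrt(0.5)), ncol = d)
  log_ratio <- apply(prop, 1, log_f) - apply(theta, 1, log_f)
  accept <- log(runif(nmc)) < log_ratio
  theta[accept, ] <- prop[accept, ]
  ## Kullback criterion at time t, up to an additive constant
  K[t] <- nn_Elogp(theta) - mean(apply(theta, 1, log_f))
}
round(K[c(1, 10, 30, 60)], 2)
```

Because the target is unnormalized, the curve is expected to level off near -log(2*pi) rather than 0; plot(K, type = "l") followed by abline(h = -log(2 * pi), lty = 2) shows the stabilization.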

[Package EntropyMCMC version 1.0.4 Index]