EntropyParallel {EntropyMCMC} | R Documentation |
Parallel simulation and Entropy estimation of MCMC's - single core and cluster versions
Description
This function simulates “parallel chains” (iid copies)
of a MCMC algorithm, i.e.
for each “time” iteration t
the next step of all the nmc
chains are generated,
then the Entropy of the density p^t
of the algorithm at iteration
t
,
E_{p^t}[\log(p^t)]
,
and the Kullback divergence between p^t
and the target density
are estimated, based on these nmc
steps iid from p^t
.
By default keep.all = FALSE
i.e. the past of the parallel chains
is discarded so that the amount of memory requirement is kept small, and
only entropy-related estimates are returned.
If keep.all = TRUE
, the entire set of chains trajectories
is saved in an array of dimensions (n,d,nmc)
, such as the one
returned by MCMCcopies
or MCMCcopies.cl
.
A version of this function implementing several HPC (parallel) computing strategies is available (see details).
Usage
EntropyParallel(mcmc_algo, n = 100, nmc = 10, Ptheta0, target, f_param, q_param,
method = "A.Nearest.Neighbor",k=1, trim = 0.02, keep.all = FALSE,
verb = TRUE, EntVect = FALSE)
EntropyParallel.cl(mcmc_algo, n = 100, nmc = 10, Ptheta0, target, f_param, q_param,
method = "A.Nearest.Neighbor",k=1, eps = 0, trim=0.02,
verb=TRUE, EntVect = FALSE, cltype="PAR_SOCK", nbnodes = 4,
par.logf = FALSE, uselogtarget = FALSE, logtarget = NULL)
Arguments
mcmc_algo |
a list defining an MCMC algorithm in terms of the
functions it uses, such as |
n |
The number of (time) iterations of each single chain to run. |
nmc |
The number of iid copies of each single chain. |
Ptheta0 |
A |
target |
The target density for which the MCMC algorithm is defined;
may be given only up to a multiplicative constant for most MCMC.
target must be a function such as the multidimensional gaussian
|
f_param |
A list holding all the necessary target parameters, including the data in an actual Bayesian model, and consistent with the target definition. |
q_param |
A list holding all the necessary parameters
for the proposal density of the MCMC algorithm |
method |
The method for estimating the entropy
|
k |
The k-nearest neighbor index, the default is |
eps |
Error bound: default of 0.0 implies exact nearest neighbour search, see the RANN package. |
trim |
not used in this implementation, only for |
keep.all |
If |
verb |
Verbose mode for summarizing output during the simulation. |
EntVect |
If |
cltype |
Character string specifying the type of cluster;
currently implemented
types are: "PAR_SOCK" for socket cluster with |
nbnodes |
The number of nodes or virtual cores requested to run the |
par.logf |
if |
uselogtarget |
Set to |
logtarget |
The function defining |
Details
About parallel computing:
For the HPC (parallel) version, the computation of the nmc
chains next step
are done by the cluster nodes:
EntropyParallel.cl
is a generic cluster version implementing
several types of cluster for running on a single, multicore computer
or on a true cluster using MPI communications. It is under development and may not
work on all platform/OS. For instance the parallel socket cluster version
does not work on Windows machines (see the parallel package documentation).
Currently tested under LINUX, Mac OSX, and a cluster
using OpenMPI and Sun Grid Engine.
Note that the parallel computing for this approach is less efficient
than the two-steps procedure consisting in
(i) parallel simulation of the iid chains using MCMCcopies.cl
to generate the
“cube” of simulated values,
and then (ii) entropy and Kullback estimation using EntropyMCMC.mc
.
This is because each node computes only one iteration at a time for the nmc
chains
here, whereas it computes all the n
iterations once for the nmc
chains
when the entire cube is saved first. This is a trade-off between memory and speed.
Note also that the Rmpi
option is less efficient than the default option
using parallel if you are running on a single computer.
MPI communication are required only for running on a true cluster/grid.
About passing your MCMC algorithm:
The list mcmc_algo
must contain the named elements:
-
name
, the name of the MCMC, such as "RWHM" -
chain
, the function for simulation of n steps of a single chain -
step
, the function for simulation of 1 step of that algorithm -
q_pdf
, the density of the proposal -
q_proposal
, the function that simulates a proposal
For examples, see the algorithms currently implemented:
RWHM
, the Random Walk Hasting-Metropolis with gaussian proposal;
HMIS_norm
, an Independence Sampler HM with gaussian proposal;
IID_norm
, a gaussian iid sampler which is merely
a "fake" MCMC for testing purposes.
Currently only non-adaptive Markov chains or adaptive chains for which
the past can be summarized by some sufficient statistics are eligible for this
computation forgetting the past of the nmc
chains.
Adaptive chains such as AMHaario
, the Adaptive-Metropolis (AM) from Haario (2001) are
not yet tested for this function.
Value
An object of class "KbMCMC"
, containing
Kullback |
A vector of estimated |
Entp |
A vector of estimated |
nmc |
The number of iid copies of each single chain. |
dim |
The state space dimension of the MCMC algorithm. |
algo |
The name of the MCMC algorithm that have been used to simulate
the copies of chains, see |
target |
The target density for which the MCMC algorithm is defined;
may be given only up to a multiplicative constant for most MCMC.
target must be a function such as the multidimensional gaussian
|
method |
The |
f_param |
A list holding all the necessary target parameters, consistent with the target definition. |
q_param |
A list holding all the necessary parameters for the proposal density of the MCMC algorithm that have been used. |
prob.accept |
Estimated rate of acceptation (meaningful for accept/reject-type algorithms). |
Ptheta |
The |
Author(s)
Didier Chauveau, Houssam Alrachid.
References
Chauveau, D. and Vandekerkhove, P. (2013), Smoothness of Metropolis-Hastings algorithm and application to entropy estimation. ESAIM: Probability and Statistics, 17, 419–431. DOI: http://dx.doi.org/10.1051/ps/2012004
Chauveau D. and Vandekerkhove, P. (2014), Simulation Based Nearest Neighbor Entropy Estimation for (Adaptive) MCMC Evaluation, In JSM Proceedings, Statistical Computing Section. Alexandria, VA: American Statistical Association. 2816–2827.
Chauveau D. and Vandekerkhove, P. (2014), The Nearest Neighbor entropy estimate: an adequate tool for adaptive MCMC evaluation. Preprint HAL http://hal.archives-ouvertes.fr/hal-01068081.
See Also
MCMCcopies
, MCMCcopies.mc
and
MCMCcopies.cl
for just simulating the iid chains, and
EntropyMCMC
or EntropyMCMC.mc
for computing entropy and Kullback estimates from an already simulated
set of iid chains (internally or from external code).
Examples
## Toy example using the bivariate gaussian target
## same as for MCMCcopies
n = 150; nmc = 50; d=2 # bivariate example
varq=0.1 # variance of the proposal (chosen too small)
q_param=list(mean=rep(0,d),v=varq*diag(d))
## initial distribution, located in (2,2), "far" from target center (0,0)
Ptheta0 <- DrawInit(nmc, d, initpdf = "rnorm", mean = 2, sd = 1)
# simulations and entropy + Kullback using the singlecore version
e1 <- EntropyParallel(RWHM, n, nmc, Ptheta0, target_norm,
target_norm_param, q_param, verb = FALSE)
par(mfrow=c(1,2))
plot(e1) # default plot.plMCMC method, convergence after about 80 iterations
plot(e1, Kullback = FALSE) # Plot Entropy estimates over time
abline(normEntropy(target_norm_param), 0, col=8, lty=2) # true E_f[log(f)]
# Another example using multicore version, (not available on Windows)
varq=0.05 # variance of the proposal, even smaller
q_param=list(mean=rep(0,d),v=varq*diag(d))
n=300 # requires more iterations to show convergence
e1 <- EntropyParallel.cl(RWHM, n, nmc, Ptheta0, target_norm,
target_norm_param, q_param, cltype="PAR_SOCK",
verb = FALSE, nbnodes = 2)
plot(e1) # convergence after about 150 iterations