mcmc_simple_step_size_adaptation {tfprobability} | R Documentation

Adapts the inner kernel's step_size based on log_accept_prob.
Description

The simple policy multiplicatively increases or decreases the step_size of the inner kernel based on the value of log_accept_prob. It is based on equation 19 of Andrieu and Thoms (2008). Given enough steps and a small enough adaptation_rate, the median of the distribution of the acceptance probability will converge to the target_accept_prob. A good target acceptance probability depends on the inner kernel: if the kernel is HamiltonianMonteCarlo, then 0.6-0.9 is a good range to aim for; for RandomWalkMetropolis it should be closer to 0.25. See the individual kernels' docstrings for guidance.
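The policy can be pictured as a simple multiplicative update. A minimal base-R sketch of the idea (an illustration only, not the tfprobability implementation; the argument names mirror target_accept_prob and adaptation_rate above):

```r
# Sketch of a multiplicative step size update (illustrative only):
# grow the step size when acceptance is above target, shrink it otherwise.
adapt_step_size <- function(step_size, log_accept_prob,
                            target_accept_prob = 0.75,
                            adaptation_rate = 0.01) {
  if (log_accept_prob > log(target_accept_prob)) {
    step_size * (1 + adaptation_rate)  # accepting too often: larger steps
  } else {
    step_size / (1 + adaptation_rate)  # accepting too rarely: smaller steps
  }
}

adapt_step_size(0.1, log(0.9))  # > 0.1
adapt_step_size(0.1, log(0.2))  # < 0.1
```

Repeated over many steps, this pushes the acceptance probability toward the target; smaller adaptation_rate values give slower but more stable convergence.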
Usage

mcmc_simple_step_size_adaptation(
  inner_kernel,
  num_adaptation_steps,
  target_accept_prob = 0.75,
  adaptation_rate = 0.01,
  step_size_setter_fn = NULL,
  step_size_getter_fn = NULL,
  log_accept_prob_getter_fn = NULL,
  validate_args = FALSE,
  name = NULL
)
Arguments

inner_kernel: TransitionKernel-like object.

num_adaptation_steps: Scalar integer number of initial steps during which to adjust the step size.

target_accept_prob: A floating point scalar or tensor representing the desired acceptance probability. Default: 0.75.

adaptation_rate: Scalar rate controlling how quickly the step size is adjusted. Default: 0.01.

step_size_setter_fn: A function with the signature (kernel_results, new_step_size) -> new_kernel_results, where kernel_results are the results of the inner_kernel, new_step_size is a floating point tensor (or nested structure of tensors), and new_kernel_results are a copy of kernel_results with the step size(s) set.

step_size_getter_fn: A function with the signature (kernel_results) -> step_size, where kernel_results are the results of the inner_kernel and step_size is a floating point tensor (or nested structure of such tensors).

log_accept_prob_getter_fn: A function with the signature (kernel_results) -> log_accept_prob, where kernel_results are the results of the inner_kernel and log_accept_prob is a floating point tensor.

validate_args: Logical. When TRUE, kernel parameters are checked for validity; when FALSE, invalid inputs may silently render incorrect outputs.

name: String prefixed to Ops created by this class. Default: "simple_step_size_adaptation".
Details

In general, adaptation prevents the chain from reaching a stationary distribution, so obtaining consistent samples requires that num_adaptation_steps be set to a value somewhat smaller than the number of burnin steps. However, it may sometimes be helpful to set num_adaptation_steps to a larger value during development in order to inspect the behavior of the chain during adaptation.
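As a rule of thumb (the same choice made in the Examples section), one might adapt during roughly the first 80% of burnin and leave the remainder for the chain to settle with a frozen step size:

```r
# Reserve the tail of burnin for running with a frozen step size.
num_burnin_steps <- 500
num_adaptation_steps <- round(num_burnin_steps * 0.8)  # 400
```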
The step size is assumed to broadcast with the chain state, potentially having leading dimensions corresponding to multiple chains. When there are fewer of those leading dimensions than there are chain dimensions, the corresponding dimensions in log_accept_prob are averaged (in the direct space, rather than the log space) before being used to adjust the step size. This means that this kernel can do either cross-chain adaptation or per-chain step size adaptation, depending on the shape of the step size.
For example, if your problem has a state with shape [S], your chain state has shape [C0, C1, S] (meaning that there are C0 * C1 total chains) and log_accept_prob has shape [C0, C1] (one acceptance probability per chain), then depending on the shape of the step size, the following will happen:

- Step size has shape [], [S] or [1]: the log_accept_prob will be averaged across its C0 and C1 dimensions. This means that you will learn a shared step size based on the mean acceptance probability across all chains. This can be useful if you don't have a lot of steps to adapt and want to average away the noise.

- Step size has shape [C1, 1] or [C1, S]: the log_accept_prob will be averaged across its C0 dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across the C1 dimension. This can be useful when the C1 dimension indexes different distributions, while C0 indexes replicas of a single distribution, all sampled in parallel.

- Step size has shape [C0, C1, 1] or [C0, C1, S]: no averaging will happen. This means that each chain will learn its own step size. This can be useful when all chains are sampling from different distributions. Even when all chains are for the same distribution, this can help during the initial warmup period.

- Step size has shape [C0, 1, 1] or [C0, 1, S]: the log_accept_prob will be averaged across its C1 dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across the C0 dimension. This can be useful when the C0 dimension indexes different distributions, while C1 indexes replicas of a single distribution, all sampled in parallel.
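Note that the averaging happens in the direct probability space, not the log space. A small base-R illustration with hypothetical acceptance probabilities for C0 = 2, C1 = 3 chains:

```r
# log_accept_prob with shape [C0, C1] = [2, 3]; rows index C0, columns C1.
log_accept_prob <- log(matrix(c(0.9, 0.5, 0.7,
                                0.8, 0.4, 0.6),
                              nrow = 2, byrow = TRUE))

# Scalar step size ([], [S] or [1]): average over both chain dimensions.
shared <- log(mean(exp(log_accept_prob)))

# Step size of shape [C1, 1] or [C1, S]: average over C0 only,
# leaving one mean acceptance probability per C1 coordinate.
per_c1 <- log(colMeans(exp(log_accept_prob)))

exp(shared)  # 0.65, the mean acceptance probability over all 6 chains
exp(per_c1)  # 0.85 0.45 0.65, one mean per C1 column
```

Averaging in the direct space ensures that a few chains with very low acceptance probabilities (very negative log values) do not dominate the adaptation signal.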
Value

A Monte Carlo sampling kernel.
References

Andrieu, C. and Thoms, J. (2008). A tutorial on adaptive MCMC. Statistics and Computing, 18(4), 343-373.
See Also

Other mcmc_kernels: mcmc_dual_averaging_step_size_adaptation(), mcmc_hamiltonian_monte_carlo(), mcmc_metropolis_adjusted_langevin_algorithm(), mcmc_metropolis_hastings(), mcmc_no_u_turn_sampler(), mcmc_random_walk_metropolis(), mcmc_replica_exchange_mc(), mcmc_slice_sampler(), mcmc_transformed_transition_kernel(), mcmc_uncalibrated_hamiltonian_monte_carlo(), mcmc_uncalibrated_langevin(), mcmc_uncalibrated_random_walk()
Examples

target_log_prob_fn <- tfd_normal(loc = 0, scale = 1)$log_prob

num_burnin_steps <- 500
num_results <- 500
num_chains <- 64L
step_size <- tf$fill(list(num_chains), 0.1)

kernel <- mcmc_hamiltonian_monte_carlo(
  target_log_prob_fn = target_log_prob_fn,
  num_leapfrog_steps = 2,
  step_size = step_size
) %>%
  mcmc_simple_step_size_adaptation(num_adaptation_steps = round(num_burnin_steps * 0.8))

res <- kernel %>% mcmc_sample_chain(
  num_results = num_results,
  num_burnin_steps = num_burnin_steps,
  current_state = rep(0, num_chains),
  trace_fn = function(x, pkr) {
    list(
      pkr$inner_results$accepted_results$step_size,
      pkr$inner_results$log_accept_ratio
    )
  }
)

samples <- res$all_states
step_size <- res$trace[[1]]
log_accept_ratio <- res$trace[[2]]