Amwg {overture} | R Documentation |
Turn a non-adaptive Metropolis sampler into an adaptive Metropolis sampler
Description
Given a non-adaptive sampler of the form f(..., s), Amwg
will return a
function g(...) that automatically adapts the Metropolis proposal standard
deviation s to try and achieve a target acceptance rate.
Usage
Amwg(f, s, batch.size = 50, target = 0.44, DeltaN, stop.after = NA)
Arguments
f |
non-adaptive Metropolis sampler of the form f(..., s) |
s |
initial value for the Metropolis proposal SD |
batch.size |
number of iterations before proposal SD is adapted |
target |
target acceptance rate |
DeltaN |
function of the form f(n) which returns the adaptation amount
based on the number of elapsed iterations, n. Defaults to |
stop.after |
stop adapting proposal SD after this many iterations |
Details
Amwg
uses the Adaptive Metropolis-Within-Gibbs algorithm from Roberts
& Rosenthal (2009), which re-scales the proposal standard deviation after a
fixed number of MCMC iterations have elapsed. The goal of the algorithm is
to achieve a target acceptance rate for the Metropolis step. After the
nth batch of MCMC iterations the log of the proposal standard
deviation, log(s)
, is increased/decreased by \delta(n)
.
log(s)
is increased by \delta(n)
if the observed acceptance rate
is more than the target acceptance rate, or decreased by \delta(n)
if
the observed acceptance rate is less than the target acceptance rate.
Amwg
keeps track of the the acceptance rate by comparing the
previously sampled value from f
to the next value. If the two values
are equal, the proposal is considered to be rejected, whereas if the two
values are different the proposal is considered accepted. Amwg
will
optionally stop adapting the proposal standard deviation after
stop.after
iterations. Setting stop.after
can be used, for
example, to stop adapting the proposal standard deviation after some burn-in
period. If stop.after=NA
(the default), Amwg
will continue to
modify the proposal standard deviation throughout the entire MCMC.
DeltaN
is set to \delta(n) = min(0.01, n^{-1/2})
unless
re-specified in the function call. Some care should be taken if re-specifying
DeltaN
, as the ergodicity of the chain may not be preserved if certain
conditions aren't met. See Roberts & Rosenthal (2009) in the references for
details.
The proposal standard deviation s
can be either a vector or a scalar.
If the initial value of s
is a scalar, f
will be treated as a
sampler for a scalar, a random vector, or a joint parameter update.
Alternatively, if the dimension of s
is equal to the dimension of the
parameters returned by f
, the individual elements s
will be
treated as individual proposal standard deviations for the elements returned
by f
. This functionality can be used, for example, if f
samples
each of its returned elements individually, updating each element using a
Metropolis step. See the examples for an illustration of this use case. In
such settings, f
should be constructed to receive s
as a vector
argument.
Value
Adaptive Metropolis sampler function of the form g(...).
References
Gareth O. Roberts & Jeffrey S. Rosenthal (2009) Examples of Adaptive MCMC, Journal of Computational and Graphical Statistics, 18:2, 349-367, doi: 10.1198/jcgs.2009.06134
Examples
# Sample from N(1, 2^2) ---------------------------------------------------
LogP <- function(x) dnorm(x, 1, 2, log=TRUE) # Target distribution
f <- function(x, s) { # Non-adaptive Metropolis sampler
x.prop <- x + rnorm(1, 0, s)
if(AcceptProp(LogP(x), LogP(x.prop))) {
x <- x.prop
}
return(x)
}
s.start <- 0.1
g <- Amwg(f, s.start, batch.size=25)
n.save <- 10000
Mcmc <- InitMcmc(n.save)
y <- 0
x <- 0
samples <- Mcmc({
y <- f(y, s.start) # Non-adaptive
x <- g(x) # Adaptive
})
plot(1:n.save, samples$x, ylim=c(-10, 10), main="Traceplots", xlab="Iteration",
ylab="Value", type='l')
lines(1:n.save, samples$y, col="red")
legend("bottomleft", legend=c("Adaptive", "Non-adaptive"),
col=c("black", "red"), lty=1, cex=0.8)
# Sample from Gamma(10, 5) ------------------------------------------------
LogP <- function(x) dgamma(x, 10, 5, log=TRUE) # Target distribution
f <- function(x, s) { # Non-adaptive Metropolis sampler
x.prop <- x + rnorm(1, 0, s)
if(AcceptProp(LogP(x), LogP(x.prop))) {
x <- x.prop
}
return(x)
}
s.start <- 10
stop.after <- 5000 # Stop changing the proposal SD after 5000 iterations
g <- Amwg(f, s.start, batch.size=25, stop.after=stop.after)
n.save <- 10000
Mcmc <- InitMcmc(n.save)
x <- 1
samples <- Mcmc({
x <- g(x)
})
hist(samples$x[stop.after:n.save,], xlab="x", main="Gamma(10, 5)", freq=FALSE)
curve(dgamma(x, 10, 5), from=0, to=max(samples$x), add=TRUE, col="blue")
# Overdispersed Poisson ---------------------------------------------------
## Likelihood:
## y_i|theta_i ~ Pois(theta_i), i=1,...,n
## Prior:
## theta_i ~ Log-Normal(mu, sigma^2)
## mu ~ Normal(m, v^2), m and v^2 fixed
## sigma^2 ~ InverseGamma(a, b), a and b fixed
SampleSigma2 <- function(theta.vec, mu, a, b, n.obs) {
1/rgamma(1, a + n.obs/2, b + (1/2)*sum((log(theta.vec) - mu)^2))
}
SampleMu <- function(theta.vec, sigma.2, m, v.2, n.obs) {
mu.var <- (1/v.2 + n.obs/sigma.2)^(-1)
mu.mean <- (m/v.2 + sum(log(theta.vec))/sigma.2) * mu.var
return(rnorm(1, mu.mean, sqrt(mu.var)))
}
LogDTheta <- function(theta, mu, sigma.2, y) {
dlnorm(theta, mu, sqrt(sigma.2), log=TRUE) + dpois(y, theta, log=TRUE)
}
# Non-adaptive Metropolis sampler
SampleTheta <- function(theta.vec, mu, sigma.2, y.vec, n.obs, s) {
theta.prop <- exp(log(theta.vec) + rnorm(n.obs, 0, s))
# Jacobians, because proposals are made on the log scale
j.curr <- log(theta.vec)
j.prop <- log(theta.prop)
log.curr <- LogDTheta(theta.vec, mu, sigma.2, y.vec) + j.curr
log.prop <- LogDTheta(theta.prop, mu, sigma.2, y.vec) + j.prop
theta.vec <- ifelse(AcceptProp(log.curr, log.prop), theta.prop, theta.vec)
return(theta.vec)
}
## Data
y.vec <- warpbreaks$breaks
n.obs <- length(y.vec)
## Setup adaptive Metropolis sampler
s <- rep(1, n.obs)
# s is a vector, so the acceptance rate of each component will be tracked
# individually in the adaptive Metropolis sampler
SampleThetaAdapative <- Amwg(SampleTheta, s)
## Set prior
v.2 <- 0.05
m <- log(30) - v.2/2
a <- 1
b <- 2
## Initialize parameters
theta.vec <- y.vec
mu <- m
## MCMC
Mcmc <- InitMcmc(10000)
samples <- Mcmc({
sigma.2 <- SampleSigma2(theta.vec, mu, a, b, n.obs)
mu <- SampleMu(theta.vec, sigma.2, m, v.2, n.obs)
theta.vec <- SampleThetaAdapative(theta.vec, mu, sigma.2, y.vec, n.obs)
})