InitMcmc {overture} | R Documentation |
Initialize a Markov chain Monte Carlo run
Description
Eliminates much of the "boilerplate" code needed for MCMC implementations by looping through the samplers and saving the resulting draws automatically.
Usage
InitMcmc(n.save, backing.path = NA, thin = 1, exclude = NULL,
overwrite = FALSE)
Arguments
n.save |
number of samples to take. If |
backing.path |
|
thin |
thinning interval |
exclude |
character vector specifying variables that should not be saved |
overwrite |
TRUE/FALSE indicating whether previous MCMC results should be overwritten |
Details
InitMcmc
returns a function that takes an R expression. The returned
function automatically loops through the R expression and saves any numeric
assignments, typically MCMC samples, that are made within it. exclude
specifies assignments that should not be saved. When exclude
is
NULL
, all the numeric assignments (scalar, vector, matrix, or array)
are saved. The dimensions of matrix and array assignments are not preserved;
they are flattened into vectors before saving. Non-numeric assignments are
not saved.
The number of iterations for the MCMC chain is determined by n.save
and thin
. The desired number of samples to be saved from the target
distribution is set by n.save
, and the chain is thinned according to
the interval set by thin
. The MCMC chain will run for n.save
x
thin
iterations.
The MCMC samples can be saved either in-memory or on-disk. Unlike saving
in-memory, saving on-disk is not constrained by available RAM. Saving
on-disk can be used in high-dimensional settings where running multiple MCMC
chains in parallel and saving the results in-memory would use up all
available RAM. File-backed saving uses big.matrix
, and the
behaviors of that implementation apply when saving on-disk. In particular,
big.matrix
has call-by-reference rather than call-by-value
behavior, so care must be taken not to introduce unintended side-effects when
modifying these objects. In-memory saving is implemented via
matrix
and has standard R behavior.
When backing.path
is NA
, samples will be saved in-memory. To
save samples on-disk, backing.path
should specify the path to the
directory where the MCMC samples should be saved. The
big.matrix
backingfile
s will be saved in that directory,
with filenames corresponding to the variable assignment names made in the R
expression. Consequently, the assignment names in the R expression must be
chosen in such a way that they are compatible as filenames on the operating
system. The big.matrix
descriptorfile
s are also named
according to the variable assignment names made in the R expression, but with
a ".desc" suffix.
By default, InitMcmc
will not overwrite the results from a previous
file-backed MCMC. This behavior can be overridden by specifying
overwrite=TRUE
in InitMcmc
, or as the second argument to the
function returned by InitMcmc
. See the examples for an illustration.
overwrite
is ignored for in-memory MCMC.
Value
A function that returns a list of either matrix
or
big.matrix
with the MCMC samples. Each row in the matrices
corresponds to one sample from the MCMC chain.
See Also
Examples
# Beta-binomial -----------------------------------------------------------
## Likelihood:
## x|theta ~ Binomial(n, theta)
## Prior:
## theta ~ Unif(0, 1)
theta.truth <- 0.75
n.obs <- 100
x <- rbinom(1, n.obs, prob=theta.truth)
# Sampling function
SampleTheta <- function() {
rbeta(1, 1 + x, 1 + n.obs - x)
}
# MCMC
Mcmc <- InitMcmc(1000)
samples <- Mcmc({
theta <- SampleTheta()
})
# Plot posterior distribution
hist(samples$theta, freq=FALSE, main="Posterior", xlab=expression(theta))
theta.grid <- seq(min(samples$theta), max(samples$theta), length.out=500)
lines(theta.grid, dbeta(theta.grid, 1 + x, 1 + n.obs - x), col="blue")
abline(v=theta.truth, col="red")
legend("topleft", legend=c("Analytic posterior", "Simulation truth"),
lty=1, col=c("blue", "red"), cex=0.75)
# Estimating mean with unknown variance -----------------------------------
## Likelihood:
## x|mu, sigma^2 ~ N(mu, sigma^2)
## Prior:
## p(mu) \propto 1
## p(sigma^2) \propto 1/sigma^2
# Simulated data
mu.truth <- 10
sigma.2.truth <- 2
n.obs <- 100
x <- rnorm(n.obs, mu.truth, sqrt(sigma.2.truth))
x.bar <- mean(x)
# Sampling functions
SampleMu <- function(sigma.2) {
rnorm(1, x.bar, sqrt(sigma.2/n.obs))
}
SampleSigma2 <- function(mu) {
1/rgamma(1, n.obs/2, (1/2)*sum((x-mu)^2))
}
# MCMC
Mcmc <- InitMcmc(1000, thin=10, exclude="sigma.2")
sigma.2 <- 1 # Initialize parameter
samples <- Mcmc({
mu <- SampleMu(sigma.2)
sigma.2 <- SampleSigma2(mu)
})
# Plot posterior distribution
hist(samples$mu, xlab=expression(mu), main="Posterior")
abline(v=mu.truth, col="red")
legend("topleft", legend="Simulation truth", lty=1, col="red", cex=0.75)
# sigma.2 is excluded from saved samples
is.null(samples$sigma.2)
# Linear regression -------------------------------------------------------
## Likelihood:
## y|beta, sigma^2, x ~ N(x %*% beta, sigma^2 * I)
## Prior:
## p(beta, sigma^2|x) \propto 1/sigma^2
# Simulated data
n.obs <- 100
x <- matrix(NA, nrow=n.obs, ncol=3)
x[,1] <- 1
x[,2] <- rnorm(n.obs)
x[,3] <- x[,2] + rnorm(n.obs)
beta.truth <- c(1, 2, 3)
sigma.2.truth <- 5
y <- rnorm(n.obs, x %*% beta.truth, sqrt(sigma.2.truth))
# Calculations for drawing beta
l.mod <- lm(y ~ x - 1)
beta.hat <- l.mod$coefficients
xtx.inv <- summary(l.mod)$cov.unscaled
xtx.inv.chol <- chol(xtx.inv)
# Calculations for drawing sigma.2
a.sigma.2 <- (n.obs - length(beta.hat))/2
b.sigma.2 <- (1/2) * t(y - x %*% beta.hat) %*% (y - x %*% beta.hat)
# Draw from multivariate normal
Rmvn <- function(mu, sigma.chol) {
d <- length(mu)
c(mu + t(sigma.chol) %*% rnorm(d))
}
SampleBeta <- function(sigma.2) {
Rmvn(beta.hat, xtx.inv.chol * sqrt(sigma.2))
}
SampleSigma2 <- function() {
1/rgamma(1, a.sigma.2, b.sigma.2)
}
# MCMC, samples saved on-disk
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(1000, backing.path=backing.path)
samples <- Mcmc({
sigma.2 <- SampleSigma2()
beta <- SampleBeta(sigma.2)
})
# Plot residuals using predictions made from the posterior mean of beta
y.hat <- x %*% colMeans(samples$beta[,])
plot(y.hat, y-y.hat, xlab="Predicted", ylab="Residual")
abline(h=0, col="red")
# Overwrite previous results ----------------------------------------------
### Overwrite specified in InitMcmc
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(5, backing.path=backing.path, overwrite=TRUE)
samples <- Mcmc({
x <- 1
})
samples <- Mcmc({
x <- 2
})
samples$x[,]
### Overwrite specified in the function returned by InitMcmc
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(5, backing.path=backing.path, overwrite=FALSE)
samples <- Mcmc({
x <- 3
})
samples <- Mcmc({
x <- 4
}, overwrite=TRUE)
samples$x[,]