| InitMcmc {overture} | R Documentation |
Initialize a Markov chain Monte Carlo run
Description
Eliminates much of the "boilerplate" code needed for MCMC implementations by looping through the samplers and saving the resulting draws automatically.
Usage
InitMcmc(n.save, backing.path = NA, thin = 1, exclude = NULL,
overwrite = FALSE)
Arguments
n.save |
number of samples to take. If |
backing.path |
|
thin |
thinning interval |
exclude |
character vector specifying variables that should not be saved |
overwrite |
TRUE/FALSE indicating whether previous MCMC results should be overwritten |
Details
InitMcmc returns a function that takes an R expression. The returned
function automatically loops through the R expression and saves any numeric
assignments, typically MCMC samples, that are made within it. exclude
specifies assignments that should not be saved. When exclude is
NULL, all the numeric assignments (scalar, vector, matrix, or array)
are saved. The dimensions of matrix and array assignments are not preserved;
they are flattened into vectors before saving. Non-numeric assignments are
not saved.
The number of iterations for the MCMC chain is determined by n.save
and thin. The desired number of samples to be saved from the target
distribution is set by n.save, and the chain is thinned according to
the interval set by thin. The MCMC chain will run for n.save
x thin iterations.
The MCMC samples can be saved either in-memory or on-disk. Unlike saving
in-memory, saving on-disk is not constrained by available RAM. Saving
on-disk can be used in high-dimensional settings where running multiple MCMC
chains in parallel and saving the results in-memory would use up all
available RAM. File-backed saving uses big.matrix, and the
behaviors of that implementation apply when saving on-disk. In particular,
big.matrix has call-by-reference rather than call-by-value
behavior, so care must be taken not to introduce unintended side-effects when
modifying these objects. In-memory saving is implemented via
matrix and has standard R behavior.
When backing.path is NA, samples will be saved in-memory. To
save samples on-disk, backing.path should specify the path to the
directory where the MCMC samples should be saved. The
big.matrix backingfiles will be saved in that directory,
with filenames corresponding to the variable assignment names made in the R
expression. Consequently, the assignment names in the R expression must be
chosen in such a way that they are compatible as filenames on the operating
system. The big.matrix descriptorfiles are also named
according to the variable assignment names made in the R expression, but with
a ".desc" suffix.
By default, InitMcmc will not overwrite the results from a previous
file-backed MCMC. This behavior can be overridden by specifying
overwrite=TRUE in InitMcmc, or as the second argument to the
function returned by InitMcmc. See the examples for an illustration.
overwrite is ignored for in-memory MCMC.
Value
A function that returns a list of either matrix or
big.matrix with the MCMC samples. Each row in the matrices
corresponds to one sample from the MCMC chain.
See Also
Examples
# Beta-binomial -----------------------------------------------------------
## Likelihood:
## x|theta ~ Binomial(n, theta)
## Prior:
## theta ~ Unif(0, 1)
theta.truth <- 0.75
n.obs <- 100
x <- rbinom(1, n.obs, prob=theta.truth)
# Sampling function
SampleTheta <- function() {
rbeta(1, 1 + x, 1 + n.obs - x)
}
# MCMC
Mcmc <- InitMcmc(1000)
samples <- Mcmc({
theta <- SampleTheta()
})
# Plot posterior distribution
hist(samples$theta, freq=FALSE, main="Posterior", xlab=expression(theta))
theta.grid <- seq(min(samples$theta), max(samples$theta), length.out=500)
lines(theta.grid, dbeta(theta.grid, 1 + x, 1 + n.obs - x), col="blue")
abline(v=theta.truth, col="red")
legend("topleft", legend=c("Analytic posterior", "Simulation truth"),
lty=1, col=c("blue", "red"), cex=0.75)
# Estimating mean with unknown variance -----------------------------------
## Likelihood:
## x|mu, sigma^2 ~ N(mu, sigma^2)
## Prior:
## p(mu) \propto 1
## p(sigma^2) \propto 1/sigma^2
# Simulated data
mu.truth <- 10
sigma.2.truth <- 2
n.obs <- 100
x <- rnorm(n.obs, mu.truth, sqrt(sigma.2.truth))
x.bar <- mean(x)
# Sampling functions
SampleMu <- function(sigma.2) {
rnorm(1, x.bar, sqrt(sigma.2/n.obs))
}
SampleSigma2 <- function(mu) {
1/rgamma(1, n.obs/2, (1/2)*sum((x-mu)^2))
}
# MCMC
Mcmc <- InitMcmc(1000, thin=10, exclude="sigma.2")
sigma.2 <- 1 # Initialize parameter
samples <- Mcmc({
mu <- SampleMu(sigma.2)
sigma.2 <- SampleSigma2(mu)
})
# Plot posterior distribution
hist(samples$mu, xlab=expression(mu), main="Posterior")
abline(v=mu.truth, col="red")
legend("topleft", legend="Simulation truth", lty=1, col="red", cex=0.75)
# sigma.2 is excluded from saved samples
is.null(samples$sigma.2)
# Linear regression -------------------------------------------------------
## Likelihood:
## y|beta, sigma^2, x ~ N(x %*% beta, sigma^2 * I)
## Prior:
## p(beta, sigma^2|x) \propto 1/sigma^2
# Simulated data
n.obs <- 100
x <- matrix(NA, nrow=n.obs, ncol=3)
x[,1] <- 1
x[,2] <- rnorm(n.obs)
x[,3] <- x[,2] + rnorm(n.obs)
beta.truth <- c(1, 2, 3)
sigma.2.truth <- 5
y <- rnorm(n.obs, x %*% beta.truth, sqrt(sigma.2.truth))
# Calculations for drawing beta
l.mod <- lm(y ~ x - 1)
beta.hat <- l.mod$coefficients
xtx.inv <- summary(l.mod)$cov.unscaled
xtx.inv.chol <- chol(xtx.inv)
# Calculations for drawing sigma.2
a.sigma.2 <- (n.obs - length(beta.hat))/2
b.sigma.2 <- (1/2) * t(y - x %*% beta.hat) %*% (y - x %*% beta.hat)
# Draw from multivariate normal
Rmvn <- function(mu, sigma.chol) {
d <- length(mu)
c(mu + t(sigma.chol) %*% rnorm(d))
}
SampleBeta <- function(sigma.2) {
Rmvn(beta.hat, xtx.inv.chol * sqrt(sigma.2))
}
SampleSigma2 <- function() {
1/rgamma(1, a.sigma.2, b.sigma.2)
}
# MCMC, samples saved on-disk
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(1000, backing.path=backing.path)
samples <- Mcmc({
sigma.2 <- SampleSigma2()
beta <- SampleBeta(sigma.2)
})
# Plot residuals using predictions made from the posterior mean of beta
y.hat <- x %*% colMeans(samples$beta[,])
plot(y.hat, y-y.hat, xlab="Predicted", ylab="Residual")
abline(h=0, col="red")
# Overwrite previous results ----------------------------------------------
### Overwrite specified in InitMcmc
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(5, backing.path=backing.path, overwrite=TRUE)
samples <- Mcmc({
x <- 1
})
samples <- Mcmc({
x <- 2
})
samples$x[,]
### Overwrite specified in the function returned by InitMcmc
backing.path <- tempfile()
dir.create(backing.path)
Mcmc <- InitMcmc(5, backing.path=backing.path, overwrite=FALSE)
samples <- Mcmc({
x <- 3
})
samples <- Mcmc({
x <- 4
}, overwrite=TRUE)
samples$x[,]