restrictedGibbsMergeSplit {sams} | R Documentation |
Merge-Split Sampling for a Partition Based on Restricted Gibbs Scans
Description
Merge-split proposals for conjugate "Chinese Restaurant Process" (CRP) mixture models using restricted Gibbs scans from a uniformly random launch state, as presented in Jain & Neal (2004), with additional functionality for the two parameter CRP prior.
Usage
restrictedGibbsMergeSplit(
partition,
logPosteriorPredictiveDensity = function(i, subset) 0,
t = 1,
mass = 1,
discount = 0,
nUpdates = 1L,
selectionWeights = NULL
)
Arguments
partition |
A numeric vector of cluster labels representing the current partition. |
logPosteriorPredictiveDensity |
A function taking an index |
t |
A non-negative integer indicating the number of restricted Gibbs scans to perform for each merge/split proposal. |
mass |
A specification of the mass (concentration) parameter in the CRP
prior. Must be greater than the |
discount |
A numeric value on the interval [0,1) corresponding to the discount parameter in the two parameter CRP prior. |
nUpdates |
An integer giving the number of merge-split proposals before returning. This has the effect of thinning the Markov chain. |
selectionWeights |
A matrix or data frame whose first two columns are the unique pairs of data indices, along with a column of weights representing how likely each pair is to be selected at the beginning of each merge-split update. |
Value
- partition
An integer vector giving the updated partition encoded using cluster labels.
- accept
The acceptance rate of the Metropolis-Hastings proposals, i.e. the number accepted proposals divided by
nUpdates
.
References
Jain, S., & Neal, R. M. (2004). A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of computational and Graphical Statistics, 13(1), 158-182.
Examples
# Neal (2000) model and data
nealData <- c(-1.48, -1.40, -1.16, -1.08, -1.02, 0.14, 0.51, 0.53, 0.78)
mkLogPosteriorPredictiveDensity <- function(data = nealData,
sigma2 = 0.1^2,
mu0 = 0,
sigma02 = 1) {
function(i, subset) {
posteriorVariance <- 1 / ( 1/sigma02 + length(subset)/sigma2 )
posteriorMean <- posteriorVariance * ( mu0/sigma02 + sum(data[subset])/sigma2 )
posteriorPredictiveSD <- sqrt(posteriorVariance + sigma2)
dnorm(data[i], posteriorMean, posteriorPredictiveSD, log = TRUE)
}
}
logPostPredict <- mkLogPosteriorPredictiveDensity()
nSamples <- 1000L
partitions <- matrix(0, nrow = nSamples, ncol = length(nealData))
accept <- 0
for ( i in 2:nSamples ) {
ms <- restrictedGibbsMergeSplit(partitions[i-1,],
logPostPredict,
t = 1,
mass = 1.0,
nUpdates = 2)
partitions[i,] <- ms$partition
accept <- accept + ms$accept
}
accept / nSamples # M-H acceptance rate
# convergence and mixing diagnostics
nSubsets <- apply(partitions, 1, function(x) length(unique(x)))
mean(nSubsets)
sum(acf(nSubsets)$acf)-1 # Autocorrelation time
entropy <- apply(partitions, 1, partitionEntropy)
plot.ts(entropy)