processMCMCChain {batchmix}    R Documentation

Process MCMC chain

Description

Applies a burn-in to the output of `batchSemiSupervisedMixtureModel` and computes point estimates of the model parameters from the remaining samples.

Usage

processMCMCChain(mcmc_output, burn, point_estimate_method = "median")

Arguments

mcmc_output

Output from `batchSemiSupervisedMixtureModel`.

burn

The number of MCMC samples to drop as burn-in.

point_estimate_method

Summary statistic used to define the point estimate. Must be `'mean'` or `'median'`; `'median'` is the default.
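
As a rough illustration of what `burn` and `point_estimate_method` control, the sketch below discards the start of a toy chain and summarises what remains. It uses base R only. The names `toy_chain`, `n_burn` and `kept` are hypothetical, and this is not the package's internal implementation; in particular, it does not show how batchmix counts `burn` relative to thinning.

```r
# Toy chain: 200 recorded samples (rows) of 3 parameters (columns).
# The first 50 samples are shifted to mimic a chain that has not converged.
set.seed(1)
toy_chain <- matrix(rnorm(200 * 3, mean = 2), nrow = 200, ncol = 3)
toy_chain[1:50, ] <- toy_chain[1:50, ] + 10

# Drop the first n_burn recorded samples (hypothetical name).
n_burn <- 50
kept <- toy_chain[(n_burn + 1):nrow(toy_chain), , drop = FALSE]

# Point estimate per parameter under each summary statistic.
median_est <- apply(kept, 2, median)
mean_est <- colMeans(kept)
```

With a roughly symmetric posterior the two summaries are close; the median is less sensitive to occasional extreme samples, which may be one reason to prefer it as a default.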

Value

A named list similar to the output of `batchSemiSupervisedMixtureModel` with some additional entries:

* `mean_est`: $(P \times K)$ matrix. The point estimate of the cluster means with columns corresponding to clusters.

* `cov_est`: $(P \times P \times K)$ array. The point estimate of the cluster covariance matrices with slices corresponding to clusters.

* `shift_est`: $(P \times B)$ matrix. The point estimate of the batch shift effect with columns corresponding to batches.

* `scale_est`: $(P \times B)$ matrix. The point estimate of the batch scale effects. The $b$th column contains the diagonal entries of the scaling matrix for the $b$th batch.

* `mean_sum_est`: $(P \times K \times B)$ array. The point estimate of the sum of the cluster means and the batch shift effect with columns corresponding to clusters and slices to batches.

* `cov_comb_est`: List of length $B$, with each entry being a $(P \times P \times K)$ array. The point estimate of the combination of the cluster covariance matrices and the batch scale effect with list entries corresponding to batches and slices of each array corresponding to clusters.

* `inferred_dataset`: $(N \times P)$ matrix. The inferred “batch-free” dataset.

* `allocation_probability`: $(N \times K)$ matrix. The point estimate of the allocation probabilities for each data point to each class.

* `prob`: $N$ vector. The point estimate of the probability of being allocated to the class with the highest probability.

* `pred`: $N$ vector. The predicted class for each sample.
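
To make the shapes of some of these entries concrete, here is a small base R sketch on toy inputs. It assumes that the predicted class is the most probable class in each row of the allocation matrix, and that the summed mean for cluster $k$ and batch $b$ is the $k$th column of the means plus the $b$th column of the shifts. All values are made up, and the code is an illustration rather than the package's own implementation.

```r
# Toy allocation probabilities: N = 4 samples (rows), K = 3 classes (columns).
allocation_probability <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4,
    0.5, 0.4, 0.1),
  ncol = 3, byrow = TRUE
)

# Predicted class: index of the largest probability in each row.
pred <- apply(allocation_probability, 1, which.max)
# Probability attached to that predicted class.
prob <- apply(allocation_probability, 1, max)

# Toy cluster means (P = 2, K = 3) and batch shifts (P = 2, B = 2).
mean_est <- matrix(c(0, 0, 3, 3, 6, 6), nrow = 2)
shift_est <- matrix(c(0.5, -0.5, 1, 1), nrow = 2)

# Build the (P x K x B) array: slice b adds the b-th shift to every cluster mean.
P <- nrow(mean_est)
K <- ncol(mean_est)
B <- ncol(shift_est)
mean_sum_est <- array(NA_real_, dim = c(P, K, B))
for (b in seq_len(B)) {
  # The length-P shift vector is recycled down each column of mean_est.
  mean_sum_est[, , b] <- mean_est + shift_est[, b]
}
```

Here `mean_sum_est[, k, b]` gives the expected location of cluster `k` as observed in batch `b`, which is why its columns index clusters and its slices index batches.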

Examples


library(batchmix)

# Data in a matrix format: 100 samples (rows) by 2 measurements (columns)
X <- matrix(c(rnorm(100, 0, 1), rnorm(100, 3, 1)), ncol = 2, byrow = TRUE)

# Initial labelling
labels <- c(
  rep(1, 10),
  sample(c(1, 2), size = 40, replace = TRUE),
  rep(2, 10),
  sample(c(1, 2), size = 40, replace = TRUE)
)

# Indicator of which labels are observed (1) and which are to be inferred (0)
fixed <- c(rep(1, 10), rep(0, 40), rep(1, 10), rep(0, 40))

# Batch
batch_vec <- sample(seq(1, 5), replace = TRUE, size = 100)

# Sampling parameters
R <- 1000    # number of MCMC iterations
burn <- 250  # burn-in passed to processMCMCChain
thin <- 50   # thinning factor

# MCMC samples
samples <- runBatchMix(X, R, thin, batch_vec, "MVN",
  initial_labels = labels,
  fixed = fixed
)

# Process the MCMC samples
processed_samples <- processMCMCChain(samples, burn)


[Package batchmix version 2.1.0 Index]