R: Posterior Probabilities of Mixture Component Memberships

pp_mixture.brmsfit {brms}

R Documentation

Posterior Probabilities of Mixture Component Memberships

Description

Compute the posterior probabilities of mixture component memberships for each observation including uncertainty estimates.

Usage

## S3 method for class 'brmsfit'
pp_mixture(
  x,
  newdata = NULL,
  re_formula = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  log = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)

pp_mixture(x, ...)

Arguments

`x`	An R object usually of class `brmsfit`.
`newdata`	An optional data.frame for which to evaluate predictions. If `NULL` (default), the original data of the model is used. `NA` values within factors are interpreted as if all dummy variables of this factor are zero. This allows, for instance, to make predictions of the grand mean when using sum coding.
`re_formula`	formula containing group-level effects to be considered in the prediction. If `NULL` (default), include all group-level effects; if `NA`, include no group-level effects.
`resp`	Optional names of response variables. If specified, predictions are performed only for the specified response variables.
`ndraws`	Positive integer indicating how many posterior draws should be used. If `NULL` (the default) all draws are used. Ignored if `draw_ids` is not `NULL`.
`draw_ids`	An integer vector specifying the posterior draws to be used. If `NULL` (the default), all draws are used.
`log`	Logical; Indicates whether to return probabilities on the log-scale.
`summary`	Should summary statistics be returned instead of the raw values? Default is `TRUE`.
`robust`	If `FALSE` (the default) the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If `TRUE`, the median and the median absolute deviation (MAD) are applied instead. Only used if `summary` is `TRUE`.
`probs`	The percentiles to be computed by the `quantile` function. Only used if `summary` is `TRUE`.
`...`	Further arguments passed to `prepare_predictions` that control several aspects of data validation and prediction.

Details

The returned probabilities can be written as P(Kn = k | Yn), that is the posterior probability that observation n originates from component k. They are computed using Bayes' Theorem

P(Kn = k | Yn) = P(Yn | Kn = k) P(Kn = k) / P(Yn),

where P(Yn | Kn = k) is the (posterior) likelihood of observation n for component k, P(Kn = k) is the (posterior) mixing probability of component k (i.e. parameter theta<k>), and

P(Yn) = \sum (k=1,...,K) P(Yn | Kn = k) P(Kn = k)

is a normalizing constant.

Value

If summary = TRUE, an N x E x K array, where N is the number of observations, K is the number of mixture components, and E is equal to length(probs) + 2. If summary = FALSE, an S x N x K array, where S is the number of posterior draws.

Examples

## Not run: 
## simulate some data
set.seed(1234)
dat <- data.frame(
  y = c(rnorm(100), rnorm(50, 2)),
  x = rnorm(150)
)
## fit a simple normal mixture model
mix <- mixture(gaussian, nmix = 2)
prior <- c(
  prior(normal(0, 5), Intercept, nlpar = mu1),
  prior(normal(0, 5), Intercept, nlpar = mu2),
  prior(dirichlet(2, 2), theta)
)
fit1 <- brm(bf(y ~ x), dat, family = mix,
            prior = prior, chains = 2, init = 0)
summary(fit1)

## compute the membership probabilities
ppm <- pp_mixture(fit1)
str(ppm)

## extract point estimates for each observation
head(ppm[, 1, ])

## classify every observation according to
## the most likely component
apply(ppm[, 1, ], 1, which.max)

## End(Not run)

[Package brms version 2.21.0 Index]