pd_RW_emiss_cat {mHMMbayes}R Documentation

Proposal distribution settings RW Metropolis sampler for mHMM categorical emission distribution(s)

Description

pd_RW_emiss_cat provides a framework to manually specify the settings of the proposal distribution of the random walk (RW) Metropolis sampler of categorical emission distribution(s) of the multilevel hidden Markov model, and creates on object of the class mHMM_pdRW_emiss. The RW metropolis sampler is used for sampling the subject level parameter estimates relating to the emission distributions of the dependent variables k, that is, the Multinomial logistic regression intercepts.

Usage

pd_RW_emiss_cat(gen, emiss_int_mle0, emiss_scalar, emiss_w)

Arguments

gen

List containing the following elements denoting the general model properties:

  • m: numeric vector with length 1 denoting the number of hidden states

  • n_dep: numeric vector with length 1 denoting the number of dependent variables

  • q_emiss: only to be specified if the data represents categorical data. Numeric vector with length n_dep denoting the number of observed categories for the categorical emission distribution for each of the dependent variables.

emiss_int_mle0

A list containing n_dep elements corresponding to each of the dependent variables k, where each element is a matrix with m rows and q_emiss[k] - 1 columns denoting the starting values for the maximum likelihood (ML) estimates of the Multinomial logit regression intercepts of the emission distribution(s). ML parameters to be estimated are based on the pooled data (data over all subjects).

emiss_scalar

A list containing n_dep elements corresponding to each of the dependent variables, where each element is a numeric vector with length 1 denoting the scale factor s. That is, the scale of the proposal distribution is composed of a covariance matrix Sigma, which is then tuned by multiplying it by a scaling factor s^2.

emiss_w

A list containing n_dep elements corresponding to each of the dependent variables, where each element is a numeric vector with length 1 denoting the weight for the overall log likelihood (i.e., log likelihood based on the pooled data over all subjects) in the fractional likelihood.

Details

When no manual values for the settings of the proposal distribution of the random walk (RW) Metropolis sampler are specified at all (that is, the function pd_RW_emiss_cat is not used), all elements in emiss_int_mle0 set to 0, emiss_scalar set to 2.93 / sqrt(q_emiss[k] - 1), and emiss_w set to 0.1. See the section Scaling the proposal distribution of the RW Metropolis sampler in vignette("estimation-mhmm") for details.

Within the function mHMM, the acceptance rate of the RW metropolis sampler relating to the emission distribution(s) can be tracked using the output parameter emiss_naccept. An acceptance rate of about 23% is considered optimal when many parameters are being updated at once (Gelman, Carlin, Stern & Rubin, 2014).

Value

pd_RW_emiss_cat returns an object of class mHMM_pdRW_emiss, containing settings of the proposal distribution of the random walk (RW) Metropolis sampler on the categorical emission distribution(s) of the multilevel hidden Markov model. The object is specifically created and formatted for use by the function mHMM, and checked for correct input dimensions. The object contains the following components:

gen

A list containing the elements m, n_dep, and q_emiss, used for checking equivalent general model properties specified under pd_RW_emiss_cat and mHMM.

emiss_int_mle0

A list containing n_dep elements, where each element is a matrix containing the starting values for the maximum likelihood (ML) estimates of the Multinomial logit regression intercepts of the emission distribution(s).

emiss_scalar

A list containing n_dep elements denoting the scale factor s of the proposal distribution.

emiss_w

A list containing n_dep elements denoting denoting the weight for the overall log likelihood in the fractional likelihood.

References

Gelman A, Carlin JB, Stern HS, Rubin DB (2014). Bayesian Data Analysis vol. 2. Taylor & Francis.

Rossi PE, Allenby GM, McCulloch R (2012). Bayesian statistics and marketing. John Wiley & Sons.

Examples

###### Example using package example data, see ?nonverbal
# specifying general model properties:
m <- 3
n_dep <- 4
q_emiss <- c(3, 2, 3, 2)

# specifying manual values for RW metropolis sampler on emission distributions
emiss_int_mle0 <- list(matrix(c( 2,  0,
                                -2, -2,
                                 0, -1), byrow = TRUE, nrow = m, ncol = q_emiss[1] - 1),
                       matrix(c( 2,
                                 2,
                                 2), byrow = TRUE, nrow = m, ncol = q_emiss[2] - 1),
                       matrix(c(-2, -2,
                                 2,  0,
                                 0, -1), byrow = TRUE, nrow = m, ncol = q_emiss[3] - 1),
                       matrix(c( 2,
                                 2,
                                 2), byrow = TRUE, nrow = m, ncol = q_emiss[4] - 1))
emiss_scalar <- list(c(2), c(3), c(2), c(3))
emiss_w <- rep(list(c(0.2)), n_dep)
manual_emiss_sampler <- pd_RW_emiss_cat(gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                                        emiss_int_mle0 = emiss_int_mle0,
                                        emiss_scalar = emiss_scalar,
                                        emiss_w = emiss_w)

# specifying starting values
start_TM <- diag(.7, m)
start_TM[lower.tri(start_TM) | upper.tri(start_TM)] <- .1
start_EM <- list(matrix(c(0.05, 0.90, 0.05,
                          0.90, 0.05, 0.05,
                          0.55, 0.45, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[1]), # vocalizing patient
                 matrix(c(0.1, 0.9,
                          0.1, 0.9,
                          0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[2]), # looking patient
                 matrix(c(0.90, 0.05, 0.05,
                          0.05, 0.90, 0.05,
                          0.55, 0.45, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[3]), # vocalizing therapist
                 matrix(c(0.1, 0.9,
                          0.1, 0.9,
                          0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[4])) # looking therapist

# Note that for reasons of running time, J is set at a ridiculous low value.
# One would typically use a number of iterations J of at least 1000,
# and a burn_in of 200.

out_3st_RWemiss <- mHMM(s_data = nonverbal,
                         gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                         start_val = c(list(start_TM), start_EM),
                         emiss_sampler = manual_emiss_sampler,
                         mcmc = list(J = 11, burn_in = 5))

out_3st_RWemiss
summary(out_3st_RWemiss)

# checking acceptance rate (for illustrative purposes, in the example,
# J is too low for getting a fair indication)
div_J <- function(x, J) x / J
J_it <- 11 - 1 # accept/reject starts at iteration 2 of MCMC algorithm
RW_emiss_accept <- sapply(out_3st_RWemiss$emiss_naccept, div_J, J_it, simplify = FALSE)

# average acceptance rate over all subjects per parameter
# rows represent each of the n_dep dependent variables, columns represent the m states
t(sapply(RW_emiss_accept, apply, MARGIN = 2, mean))





[Package mHMMbayes version 1.1.0 Index]