hef {bang}R Documentation

Hierarchical Exponential Family Model

Description

Produces random samples from the posterior distribution of the parameters of certain hierarchical exponential family models.

Usage

hef(n = 1000, model = c("beta_binom", "gamma_pois"), data, ...,
  prior = "default", hpars = NULL, param = c("trans", "original"),
  init = NULL, nrep = NULL)

Arguments

n

An integer scalar. The size of the posterior sample required.

model

A character string. Abbreviated name for the response-population distribution combination. For a hierarchical normal model see hanova1 (hierarchical one-way analysis of variance (ANOVA)).

data

A numeric matrix. The format depends on model. See Details.

...

Optional further arguments to be passed to ru.

prior

The log-prior for the parameters of the hyperprior distribution. If the user wishes to specify their own prior then prior must be an object returned from a call to set_user_prior. Otherwise, prior is a character scalar giving the name of the required in-built prior. If prior is not supplied then a default prior is used. See Details.

hpars

A numeric vector. Used to set parameters (if any) in an in-built prior.

param

A character scalar. If param = "trans" (the default) then the marginal posterior of hyperparameter vector φ is reparameterized in a way designed to improve the efficiency of sampling from this posterior. If param = "original" the original parameterization is used. The former tends to make the optimizations involved in the ratio-of-uniforms algorithm more stable and to increase the probability of acceptance, but at the expense of slower function evaluations.

init

A numeric vector of length 2. Optional initial estimates for the search for the mode of the posterior density of the hyperparameter vector φ.

nrep

A numeric scalar. If nrep is not NULL then nrep gives the number of replications of the original dataset simulated from the posterior predictive distribution. Each replication is based on one of the samples from the posterior distribution. Therefore, nrep must not be greater than n. In that event nrep is set equal to n.

Details

Conditional on population-specific parameter vectors θ1, ..., θJ the observed response data y1, ..., yJ within each population are modelled as random samples from a distribution in an exponential family. The population parameters θ1, ..., θJ are modelled as random samples from a common population distribution, chosen to be conditionally conjugate to the response distribution, with hyperparameter vector φ. Conditionally on θ1, ..., θJ, y1, ..., yJ are independent of each other and are independent of φ. A hyperprior is placed on φ. The user can either choose parameter values of a default hyperprior or specify their own hyperprior using set_user_prior.

The ru function in the rust package is used to draw a random sample from the marginal posterior of the hyperparameter vector φ. Then, conditional on these values, population parameters are sampled directly from the conditional posterior density of θ1, ..., θJ given φ and the data.

We outline each model, specify the format of the data, give the default (log-)priors (up to an additive constant) and detail the choices of ratio-of-uniforms parameterization param.

Beta-binomial: For j = 1, ..., J, Yj | pj are i.i.d binomial(nj, pj), where pj is the probability of success in group j and nj is the number of trials in group j. pj are i.i.d. beta(α, β), so and φ = (α, β). data is a 2-column matrix: the numbers of successes in column 1 and the corresponding numbers of trials in column 2.

Priors:

prior = "bda" (the default): log π(α, β) = - 2.5 log(α + β), α > 0, β > 0. [See Section 5.3 of Gelman et al. (2014).]

prior = "gamma": independent gamma priors on α and β, i.e. log π(α, β) = (s1 - 1)logα - r1 α + (s2 - 1)logβ - r2 β, α > 0, β > 0. where the respective shape (s1, s2) and rate (r1, r2) parameters are specified using hpars = (s1, r1, s2, r2). The default setting is hpars = c(1, 0.01, 1, 0.01).

Parameterizations for sampling:

param = "original" is (α, β), param = "trans" (the default) is φ1 = logit(α/(α+β)) = log(α/β), φ2 = log(α+β). See Section 5.3 of Gelman et al. (2014).

Gamma-Poisson: For j = 1, ..., J, Yj | λj are i.i.d Poisson(ejλj), where ej is the exposure in group j, based on the total length of observation time and/or size of the population at risk of the event of interest and λj is the mean number of events per unit of exposure. λj are i.i.d. gamma(α, β), so φ = (α, β). data is a 2-column matrix: the counts yj of the numbers of events in column 1 and the corresponding exposures ej in column 2.

Priors:

prior = "gamma" (the default): independent gamma priors on α and β, i.e. log π(α, β) = (s1 - 1)logα - r1 α + (s2 - 1)logβ - r2 β, α > 0, β > 0. where the respective shape (s1, s2) and rate (r1, r2) parameters are specified using hpars = (s1, r1, s2, r2). The default setting is hpars = c(1, 0.01, 1, 0.01).

Parameterizations for sampling:

param = "original" is (α, β), param = "trans" (the default) is φ1 = log(α/β), φ2 = log(β).

Value

An object (list) of class "hef", which has the same structure as an object of class "ru" returned from ru. In particular, the columns of the n-row matrix sim_vals contain the simulated values of φ. In addition this list contains the arguments model, data and prior detailed above, an n by J matrix theta_sim_vals: column j contains the simulated values of θj and call: the matched call to hef.

If nrep is not NULL then this list also contains data_rep, a numerical matrix with nrep columns. Each column contains a replication of the first column of the original data data[, 1], simulated from the posterior predictive distribution.

References

Gelman, A., Carlin, J. B., Stern, H. S. Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014) Bayesian Data Analysis. Chapman & Hall / CRC. http://www.stat.columbia.edu/~gelman/book

See Also

The ru function in the rust package for details of the arguments that can be passed to ru via hef.

hanova1 for hierarchical one-way analysis of variance (ANOVA).

set_user_prior to set a user-defined prior.

Examples

############################ Beta-binomial #################################

# ------------------------- Rat tumor data ------------------------------- #

# Default prior, sampling on (rotated) (log(mean), log(alpha + beta)) scale
rat_res <- hef(model = "beta_binom", data = rat)

# Hyperparameters alpha and beta
plot(rat_res)
# Parameterization used for sampling
plot(rat_res, ru_scale = TRUE)

summary(rat_res)

# Choose rats with extreme sample probabilities
pops <- c(which.min(rat[, 1] / rat[, 2]), which.max(rat[, 1] / rat[, 2]))
# Population-specific posterior samples: separate plots
plot(rat_res, params = "pop", plot_type = "both", which_pop = pops)
# Population-specific posterior samples: one plot
plot(rat_res, params = "pop", plot_type = "dens", which_pop = pops,
     one_plot = TRUE, add_legend = TRUE)

# Default prior, sampling on (rotated) (alpha, beta) scale
rat_res <- hef(model = "beta_binom", data = rat, param = "original")

plot(rat_res)
plot(rat_res, ru_scale = TRUE)

summary(rat_res)

# To produce a plot akin to Figure 5.3 of Gelman et al. (2014) we
# (a) Use the same prior for (alpha, beta)
# (b) Don't use axis rotation (rotate = FALSE)
# (c) Plot on the scale used for ratio-of-uniforms sampling (ru_scale = TRUE)
# (d) Note that the mode is relocated to (0, 0) in the plot
rat_res <- hef(model = "beta_binom", data = rat, rotate = FALSE)

plot(rat_res, ru_scale = TRUE)

# This is the estimated location of the posterior mode
rat_res$f_mode

# User-defined prior, passing parameters
# (equivalent to prior = "gamma" with hpars = c(1, 0.01, 1, 0.01))
user_prior <- function(x, hpars) {
  return(dexp(x[1], hpars[1], log = TRUE) + dexp(x[2], hpars[2], log = TRUE))
}
user_prior_fn <- set_user_prior(user_prior, hpars = c(0.01, 0.01))
rat_res <- hef(model = "beta_binom", data = rat, prior = user_prior_fn)

plot(rat_res)

summary(rat_res)

############################ Gamma-Poisson #################################

# ------------------------ Pump failure data ------------------------------ #

pump_res <- hef(model = "gamma_pois", data = pump)
# Hyperparameters alpha and beta

plot(pump_res)

# Parameterization used for sampling
plot(pump_res, ru_scale = TRUE)
summary(pump_res)

# Choose pumps with extreme sample rates
pops <- c(which.min(pump[, 1] / pump[, 2]), which.max(pump[, 1] / pump[, 2]))
plot(pump_res, params = "pop", plot_type = "dens", which_pop = pops)

[Package bang version 1.0.1 Index]