nested.regression {BoomSpikeSlab}R Documentation

Nested Regression

Description

Fits a Bayesian hierarchical regression model to data nested within groups. The model is

% y_{ig} \sim N(x_i \beta_g, \sigma^2) \\ % 1 / \sigma^2 \sim Gamma(df/2, ss/2) \\ % \beta_g \sim N(b, V) \\ %

Optional hyperprior distributions can be supplied to the prior parameters.

% b ~ N(prior.mean, prior.variance) \\ % V ~ InverseWishart(df, variance.guess). \\ %

Either hyperprior can be omitted, in which case the corresponding prior parameter is assumed fixed at the user-supplied value.

Usage

NestedRegression(response,
                 predictors,
                 group.id,
                 residual.precision.prior = NULL,
                 coefficient.prior = NULL,
                 coefficient.mean.hyperprior = NULL,
                 coefficient.variance.hyperprior = NULL,
                 suf = NULL,
                 niter,
                 ping = niter / 10,
                 sampling.method = c("ASIS", "DA"),
                 seed = NULL)

Arguments

response

A numeric vector. The response variable to be modeled.

predictors

A numeric matrix of predictor variables, including an intercept term if one is desired. The number of rows must match length(response).

group.id

A factor (or object that can be converted using as.factor) naming the group to which each entry in response belongs.

residual.precision.prior

An object of type SdPrior describing the prior distribution of the residual standard deviation.

coefficient.prior

An object of class MvnPrior, or NULL. If non-NULL this gives the initial values of the prior distribution of the regression coefficients in the nested regression model. This argument must be non-NULL if either coefficient.mean.hyperprior or coefficient.variance.hyperprior is NULL.

coefficient.mean.hyperprior

An object of class MvnPrior, specifying the hyperprior distribution for the mean of coefficient.prior. This argument can also be NULL, or FALSE. If NULL then a default prior will be used when learning the mean of the prior distribution. If FALSE then the mean of the prior distribution will not be learned; the mean of the coefficient.prior distribution will be assumed instead.

coefficient.variance.hyperprior

An object of class InverseWishartPrior, specifying the hyperprior distribution for the variance of coefficient.prior. This argument can also be NULL, or FALSE. If NULL then a default prior will be used when learning the variance of the prior distribution. If FALSE then the variance of the prior distribution will not be learned; the variance of the coefficient.prior distribution will be assumed instead.

suf

A list, where each entry is of type RegressionSuf, giving the sufficient statistics for each group, or NULL. If NULL, then suf will be computed from response, predictors, and group.id. If non-NULL then these arguments will not be accessed, in which case they can be left unspecified. In 'big data' problems this can be a significant computational savings.

niter

The desired number of MCMC iterations.

ping

The frequency with which to print status updates.

sampling.method

The MCMC sampling scheme that should be used. If either hyperprior is set to FALSE then the "DA" method will be used.

seed

The integer-valued seed (or NULL) to use for the C++ random number generator.

Details

Note: ASIS (Yu and Meng, 2011) has slightly better MCMC convergence, but is slightly slower than the classic DA (data augmentation) method, which alternates between sampling group-level regression coefficients and prior parameters. Both methods are pretty fast.

Value

A list containing MCMC draws from the posterior distribution of model parameters. Each of the following is a vector, matrix, or array, with first index corresponding to MCMC draws, and later indices to distinct parameters.

Author(s)

Steven L. Scott

Examples

SimulateNestedRegressionData <- function() {
  beta.hyperprior.mean <- c(8, 6, 7, 5)
  xdim <- length(beta.hyperprior.mean)
  beta.hyperprior.variance <-
    rWishart(2 * xdim, diag(rep(1, xdim)), inverse = TRUE)

  number.of.groups <- 27
  nobs.per.group = 23
  beta <- rmvn(number.of.groups,
               beta.hyperprior.mean,
               beta.hyperprior.variance)

  residual.sd <- 2.4
  X <- cbind(1, matrix(rnorm(number.of.groups * (xdim - 1) * nobs.per.group),
                       ncol = xdim - 1))
  group.id <- rep(1:number.of.groups, len = nrow(X))
  y.hat <- numeric(nrow(X))
  for (i in 1:nrow(X)) {
    y.hat[i] = sum(X[i, ] * beta[group.id[i], ])
  }
  y <- rnorm(length(y.hat), y.hat, residual.sd)
  suf <- BoomSpikeSlab:::.RegressionSufList(X, y, group.id)

  return(list(beta.hyperprior.mean = beta.hyperprior.mean,
              beta.hyperprior.variance = beta.hyperprior.variance,
              beta = beta,
              residual.sd = residual.sd,
              X = X,
              y = y,
              group.id = group.id,
              suf = suf))
}

d <- SimulateNestedRegressionData()
model <- NestedRegression(suf = d$suf, niter = 500)


[Package BoomSpikeSlab version 1.2.6 Index]