mlm.spike {BoomSpikeSlab} | R Documentation |
Spike and slab multinomial logistic regression
Description
MCMC algorithm for multinomial logist models with a 'spike-and-slab' prior that places some amount of posterior probability at zero for a subset of the regression coefficients.
Usage
mlm.spike(subject.formula,
choice.formula = NULL,
niter,
data,
choice.name.separator = ".",
contrasts = NULL,
subset,
prior = NULL,
ping = niter / 10,
proposal.df = 3,
rwm.scale.factor = 1,
nthreads = 1,
mh.chunk.size = 10,
proposal.weights = c("DA" = .5, "RWM" = .25, "TIM" = .25),
seed = NULL,
...)
Arguments
subject.formula |
A model |
choice.formula |
A model |
niter |
The number of MCMC iterations to run. Be sure to include enough so you can discard a burn-in set. |
data |
A data frame containing the data referenced in
|
choice.name.separator |
The character used to separate the predictor names from the choice values for the choice-level predictor variables in 'data'. |
contrasts |
An optional list. See the |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
prior |
An object of class
A convenience function: |
ping |
The frequency with which status updates are printed to the
console. Measured in MCMC iterations. If |
proposal.df |
The "degrees of freedom" parameter that the
Metropolis-Hastings algorithm should use for the multivariate T
proposal distribution. If |
rwm.scale.factor |
The scale factor to use for random walk Metropolis updates. See details. |
nthreads |
The number of CPU-threads to use for data augmentation. |
mh.chunk.size |
The maximum number of coefficients to draw in a single "chunk" of a Metropolis-Hastings update. See details. |
proposal.weights |
A vector of 3 probabilities (summing to 1) indicating the probability of each type of MH proposal during each iteration. The weights should be given names "DA", "RWM", and "TIM" for clarity. |
seed |
Seed to use for the C++ random number generator. It
should be |
... |
Extra arguments to be passed to |
Details
Model Details:
A multinomial logit model has two sets of predictors: one measuring characterisitcs of the subject making the choice, and the other measuring characteristics of the items being chosen. The model can be written
Pr(y[i] = m) \propto exp(beta.subject[, m] * x.subject[i, ]
+ beta.choice * x.choice[i, , m])
The coefficients in this model are beta.subject and beta.choice. beta.choice is a subject.xdim by ('nchoices' - 1) matrix. Each row multiplies the design matrix produced by subject.formula for a particular choice level, where the first choice level is omitted (logically set to zero) for identifiability. beta.choice is a vector multiplying the design matrix produced by choice.formula, and thre are 'nchoices' of such matrices.
The coefficient vector 'beta' is the concatenation c(beta.subject, beta.choice), where beta.subject is vectorized by stacking its columns (in the usual R fashion). This means that the first contiguous region of beta contains the subject-level coefficients for choice level 2.
MCMC Details:
The MCMC algorithm randomly moves between three tyes of updates: data augmentation, random walk Metropolis (RWM), and tailored independence Metropolis (TIM).
DA: Each observation in the model is associated with a set of latent variables that renders the complete data posterior distribution conditionally Gaussian. The augmentation scheme is described in Tuchler (2008). The data augmentation algorithm conditions on the latent data, and integrates out the coefficients, to sample the inclusion vector (i.e. the vector of indicators showing which coefficients are nonzero) using Gibbs sampling. Then the coefficients are sampled given complete data conditional on inclusion. This is the only move that attemps a dimension change.
RWM: A chunk of the coefficient vector (up to mh.chunk.size) is selected. The proposal distribution is either multivariate normal or multivariate T (depending on 'proposal.df') centered on current values of this chunk. The precision parameter of the normal (or T) is the negative Hessian of the un-normalized log posterior, evaluated at the current value. The precision is divided by rwm.scale.factor. Only coefficients currently included in the model at the time of the proposal will be modified.
TIM: A chunk of the coefficient vector (up to mh.chunk.size) is selected. The proposal distribution is constructed by locating the posterior mode (using the current value as a starting point). The proposal is a Gaussian (or multivariate T) centered on the posterior mode, with precision equal to the negative Hessian evaluated at the mode. This is an expensive, but effective step. If the posterior mode finding fails (for numerical reasons) then a RWM proposal will be attempted instead.
Value
Returns an object of class mlm.spike
, which inherits from
logit.spike
and lm.spike
. The returned object is a list
with the following elements
beta |
A |
prior |
The prior used to fit the model. If a |
MH.accounting |
A summary of the amount of time spent in each type of MCMC move, and the acceptance rate for each move type. |
Author(s)
Steven L. Scott
References
Tuchler (2008), "Bayesian Variable Selection for Logistic Models Using Auxiliary Mixture Sampling", Journal of Computational and Graphical Statistics, 17 76 – 94.
See Also
lm.spike
SpikeSlabPrior
,
plot.lm.spike
,
summary.lm.spike
,
predict.lm.spike
.
Examples
rmulti <- function (prob) {
## Sample from heterogeneous multinomial distributions.
if (is.vector(prob)) {
S <- length(prob)
return(sample(1:S, size = 1, prob = prob))
}
nc <- apply(prob, 1, sum)
n <- nrow(prob)
S <- ncol(prob)
u <- runif(n, 0, nc)
alive <- rep(TRUE, n)
z <- numeric(n)
p <- rep(0, n)
for (s in 1:S) {
p <- p + prob[, s]
indx <- alive & (u < p)
alive[indx] <- FALSE
z[indx] <- s
if (!any(alive))
break
}
return(z)
}
## Define sizes for the problem
subject.predictor.dimension <- 3
choice.predictor.dimension <- 4
nchoices <- 5
nobs <- 1000
## The response can be "a", "b", "c", ...
choice.levels <- letters[1:nchoices]
## Create "subject level characteristics".
subject.x <- matrix(rnorm(nobs * (subject.predictor.dimension - 1)),
nrow = nobs)
subject.beta <- cbind(
0, matrix(rnorm(subject.predictor.dimension * (nchoices - 1)),
ncol = nchoices - 1))
colnames(subject.x) <- state.name[1:ncol(subject.x)]
## Create "choice level characteristics".
choice.x <- matrix(rnorm(nchoices * choice.predictor.dimension * nobs),
nrow = nobs)
choice.characteristics <- c("foo", "bar", "baz", "qux")
choice.names <- as.character(outer(choice.characteristics, choice.levels, FUN = paste, sep = ":"))
colnames(choice.x) <- choice.names
choice.beta <- rnorm(choice.predictor.dimension)
## Combine an intercept term, subject data, and choice data.
X <- cbind(1, subject.x, choice.x)
p <- ncol(X)
true.beta <- c(subject.beta[, -1], choice.beta)
Beta <- matrix(nrow = nchoices, ncol = p)
for (m in 1:nchoices) {
Beta[m, ] <- rep(0, p)
Beta[m, 1:subject.predictor.dimension] <- subject.beta[, m]
begin <- subject.predictor.dimension + 1 + (m-1) * choice.predictor.dimension
end <- begin + choice.predictor.dimension - 1
Beta[m, begin:end] <- choice.beta
}
eta <- X %*% t(Beta)
prob <- exp(eta)
prob <- prob / rowSums(prob)
response <- as.factor(choice.levels[rmulti(prob)])
simulated.data <- as.data.frame(X[, -1])
simulated.data$response <- response
# NOTE: The number of MCMC iterations is artificially small to reduce
# the run time.
model <- mlm.spike(response ~ Alabama + Alaska,
response ~ foo + bar + baz + qux,
niter = 100,
choice.name.separator = ":",
expected.subject.model.size = -1,
expected.choice.model.size = -1,
data = simulated.data,
proposal.weights = c("DA" = .8, "RWM" = .1, "TIM" = .1))