logit.spike {BoomSpikeSlab}  R Documentation 
Spike and slab logistic regression
Description
MCMC algorithm for logistic regression models with a 'spikeandslab' prior that places some amount of posterior probability at zero for a subset of the regression coefficients.
Usage
logit.spike(formula,
niter,
data,
subset,
prior = NULL,
na.action = options("na.action"),
contrasts = NULL,
drop.unused.levels = TRUE,
initial.value = NULL,
ping = niter / 10,
nthreads = 0,
clt.threshold = 2,
mh.chunk.size = 10,
proposal.df = 3,
sampler.weights = c("DA" = .333, "RWM" = .333, "TIM" = .333),
seed = NULL,
...)
Arguments
formula 
formula for the maximal model (with all variables
included), this is parsed the same way as a call to

niter 
The number of MCMC iterations to run. Be sure to include enough so you can throw away a burnin set. 
data 
An optional data frame, list or environment (or object coercible by 'as.data.frame' to a data frame) containing the variables in the model. If not found in 'data', the variables are taken from 'environment(formula)', typically the environment from which logit.spike' is called. 
subset 
An optional vector specifying a subset of observations to be used in the fitting process. 
prior 
A n object inheriting from

na.action 
A function which indicates what should happen when
the data contain 
contrasts 
An optional list. See the 
drop.unused.levels 
A logical value indicating whether factor levels that are unobserved should be dropped from the model. 
initial.value 
Initial value for the MCMC algorithm. Can either
be a numeric vector, a 
ping 
If positive, then print a status update to the console
every 
nthreads 
The number of CPUthreads to use for data augmentation. There is some small overhead to stopping and starting threads. For small data sets, thread overhead will make it faster to run single threaded. For larger data sets multithreading can speed things up substantially. This is all machine dependent, so please experiment. 
clt.threshold 
When the model is presented with binomial data
(i.e. when the response is a twocolumn matrix) the data
augmentation algorithm can be made more efficient by updating a
single, asymptotically normal scalar quantity for each unique value
of the predictors. The asymptotic result will be used whenever the
number of successes or failures exceeds 
mh.chunk.size 
The maximum number of coefficients to draw in a single "chunk" of a MetropolisHastings update. See details. 
proposal.df 
The degrees of freedom parameter to use in MetropolisHastings proposals. See details. 
sampler.weights 
The proportion of MCMC iterations spent in each of the three algorithms described in the Details section. This must be a vector of length 3, with names "DA", "RWM" and "TIM", containing nonnegative elements that sum to (within numerical error .999 or 1.001 are okay). 
seed 
Seed to use for the C++ random number generator. It
should be 
... 
Extra arguments passed to

Details
Model parameters are updated using a composite of three MetropolisHastings updates. An auxiliary mixture sampling algorithm (Tuchler 2008) updates the entire parameter vector at once, but can mix slowly.
The second algorithm is a random walk Metropolis update based on a
multivariate T proposal with proposal.df
degrees of freedom.
If proposal.df
is nonpositive then a Gaussian proposal is used.
The variance of the proposal distribution is based on the Fisher
information matrix evaluated at the current draw of the coefficients.
The third algorithm is an independence Metropolis sampler centered on
the posterior mode with variance determined by posterior information
matrix (Fisher information plus prior information). If
proposal.df > 0
then the tails of the proposal are inflated so
that a multivariate T proposal is used instead.
For either of the two MH updates, at most mh.chunk.size
coefficients will be updated at a time. At each iteration, one of the
three algorithms is chosen at random. The auxiliary mixture sampler
is the only one that can change the dimension of the coefficient
vector. The MH algorithms only update the coefficients that are
currently nonzero.
Value
Returns an object of class logit.spike
, which inherits from
lm.spike
. The returned object is a list with the following
elements
beta 
A 
prior 
The prior used to fit the model. If a 
Author(s)
Steven L. Scott
References
Tuchler (2008), "Bayesian Variable Selection for Logistic Models Using Auxiliary Mixture Sampling", Journal of Computational and Graphical Statistics, 17 76 – 94.
See Also
lm.spike
SpikeSlabPrior
,
plot.logit.spike
,
PlotLogitSpikeFitSummary
PlotLogitSpikeResiduals
summary.logit.spike
,
predict.logit.spike
.
Examples
if (requireNamespace("MASS")) {
data(Pima.tr, package = "MASS")
data(Pima.te, package = "MASS")
pima < rbind(Pima.tr, Pima.te)
model < logit.spike(type == "Yes" ~ ., data = pima, niter = 500)
plot(model)
plot(model, "fit")
plot(model, "residuals")
plot(model, "size")
summary(model)
}