spike.slab.prior {BoomSpikeSlab} | R Documentation |
Create a spike and slab prior for use with lm.spike.
Description
Creates a spike and slab prior for use with lm.spike.
Usage
SpikeSlabPrior(x,
y = NULL,
expected.r2 = .5,
prior.df = .01,
expected.model.size = 1,
prior.information.weight = .01,
diagonal.shrinkage = .5,
optional.coefficient.estimate = NULL,
max.flips = -1,
mean.y = mean(y, na.rm = TRUE),
sdy = sd(as.numeric(y), na.rm = TRUE),
prior.inclusion.probabilities = NULL,
sigma.upper.limit = Inf)
SpikeSlabPriorDirect(coefficient.mean,
coefficient.precision,
prior.inclusion.probabilities,
prior.df,
sigma.guess,
max.flips = -1,
sigma.upper.limit = Inf)
ConditionalZellnerPrior(xdim,
optional.coefficient.estimate = NULL,
expected.model.size = 1,
prior.information.weight = .01,
diagonal.shrinkage = .5,
max.flips = -1,
prior.inclusion.probabilities = NULL)
Arguments
x |
The design matrix for the regression problem. Missing data is not allowed. |
y |
The vector of responses for the regression. Missing data is
not allowed. If |
expected.r2 |
The expected R-square for the regression. The spike and slab prior
requires an inverse gamma prior on the residual variance of the
regression. The prior can be parameterized in terms of a guess at
the residual variance, and a "degrees of freedom" representing the
number of observations that the guess should weigh. The guess at
sigma^2 is set to |
prior.df |
A positive scalar representing the prior 'degrees of freedom' for
estimating the residual variance. This can be thought of as the
amount of weight (expressed as an observation count) given to the
|
expected.model.size |
A positive number less than |
prior.information.weight |
A positive scalar. Number of observations worth of weight that should be given to the prior estimate of beta. |
diagonal.shrinkage |
The conditionally Gaussian prior for beta (the "slab") starts with a
precision matrix equal to the information in a single observation.
However, this matrix might not be full rank. The matrix can be made
full rank by averaging with its diagonal. |
optional.coefficient.estimate |
If desired, an estimate of the regression coefficients can be supplied. In most cases this will be a difficult parameter to specify. If omitted then a prior mean of zero will be used for all coordinates except the intercept, which will be set to mean(y). |
max.flips |
The maximum number of variable inclusion indicators
the sampler will attempt to sample each iteration. If
|
mean.y |
The mean of the response vector, for use in cases when specifying the response vector is undesirable. |
xdim |
The dimension of the predictor matrix. |
sdy |
The standard deviation of the response vector, for use in cases when specifying the response vector is undesirable. |
prior.inclusion.probabilities |
A vector giving the prior probability of inclusion for each variable. |
sigma.upper.limit |
The largest acceptable value for the residual
standard deviation. A non-positive number is interpreted as
|
coefficient.mean |
The prior mean of the coefficients in the maximal model (with all variables included). |
coefficient.precision |
The prior precision (inverse variance) of the coefficients in the maximal model (with all variables included). |
sigma.guess |
Prior estimate of the residual standard deviation. |
Value
A list with with the components necessary to run lm.spike
.
SpikeSlabPrior
is intended for use in traditional regression
problems, when the matrix of predictors and the vector of responses
are available to the modeler.
ConditionalZellnerPrior
is intended for cases where the
predictor variables are potentially unknown, because they depend on
model parameters or latent variables, for example. For models that
support ConditionalZellnerPrior, the underlying C++ code must know
where to find the relevant predictors on which to condition the prior.
Author(s)
Steven L. Scott
References
George and McCulloch (1997), "Approaches to Bayesian Variable Selection", Statistica Sinica, 7, 339 – 373.
https://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n26.pdf
Examples
x <- cbind(1, matrix(rnorm(900), ncol = 9))
beta <- rep(0, 10)
beta[1] <- 3
beta[5] <- -4
beta[8] <- 2
y <- rnorm(100, x %*% beta)
## x has 10 columns, including the intercept
prior <- SpikeSlabPrior(x, y,
expected.model.size = 3, # expect 3 nonzero predictors
prior.df = .01, # weaker prior than the default
prior.information.weight = .01,
diagonal.shrinkage = 0, # use Zellner's prior
optional.coefficient.estimate = rep(0, 10) # shrink to zero
)
## now 'prior' can be fed to 'lm.spike'
model <- lm.spike(y ~ x - 1, niter = 1000, prior = prior)