bms {BMS} | R Documentation
Bayesian Model Sampling and Averaging
Description
Given data and prior information, this function samples from the space of all possible covariate combinations, either via MC3 sampling or by full enumeration, and returns aggregate results.
Usage
bms(
X.data,
burn = 1000,
iter = NA,
nmodel = 500,
mcmc = "bd",
g = "UIP",
mprior = "random",
mprior.size = NA,
user.int = TRUE,
start.value = NA,
g.stats = TRUE,
logfile = FALSE,
logstep = 10000,
force.full.ols = FALSE,
fixed.reg = numeric(0)
)
Arguments
X.data: a data frame or a matrix, with the dependent variable in the first column, followed by the covariates (alternatively, X.data can also be provided as a formula). Note that bms automatically estimates a constant, so constant terms need not be included among the covariates.
burn: the (positive integer) number of burn-in draws for the MC3 sampler, defaults to 1000 (not taken into account if mcmc="enumerate").

iter: if mcmc is set to an MC3 sampler, the number of iteration draws to be sampled (excluding burn-ins), default 3000 draws.
nmodel: the number of best models for which information is stored (default 500). Best models are used for convergence analysis between likelihoods and MCMC frequencies, as well as for likelihood-based inference.
mcmc: a character denoting the model sampler to be used: "bd" (birth-death MC3 sampler, the default for larger sets of covariates), "rev.jump" (reversible-jump sampler that adds a swap step), "enumerate" (full enumeration of all covariate combinations, the default for small sets of covariates), or the interaction variants "bd.int" and "rev.jump.int" (cf. Details and the sketch following this list).

g: the hyperparameter on Zellner's g-prior for the regression coefficients: g="UIP" (default) sets g equal to the number of observations N; g="BRIC" sets g=max(N, K^2) as in Fernandez, Ley and Steel (2001); g="hyper", g="hyper=UIP", g="hyper=BRIC" or g="hyper=x" (for a numeric x) invoke the hyper-g prior described in Details; g="EBL" estimates a local empirical Bayes g; alternatively, a positive number fixes g directly.

mprior: a character denoting the model prior choice, defaulting to "random": mprior="random" employs the binomial-beta ('random theta') prior of Ley and Steel (2008); mprior="fixed" uses fixed common prior inclusion probabilities as in Sala-i-Martin, Doppelhofer and Miller (2004); mprior="uniform" employs the uniform model prior; mprior="customk" and mprior="pip" allow custom model-size priors and prior inclusion probabilities (cf. mprior.size).
mprior.size: if mprior is "fixed" or "random", the prior expected model size (default K/2); if mprior="customk", a vector of length K+1 giving prior weights for model sizes 0 to K; if mprior="pip", a vector of K prior inclusion probabilities.
user.int: 'interactive mode': prints results to the console after the routine has finished and plots a chart (default TRUE).
start.value: specifies the starting model of the iteration chain, for instance a specific model given by the corresponding column indices (e.g. start.value=numeric(K) or start.value=0 starts from the null model including solely a constant term). The default start.value=NA selects a starting model automatically. If mcmc="enumerate", start.value is the model index at which enumeration starts (default: the null model).
g.stats: TRUE (default) if statistics on the shrinkage factor g/(1+g) are to be collected during sampling; set g.stats=FALSE for somewhat faster iteration.

logfile: setting logfile=TRUE keeps track of the sampling procedure in a log file in the working directory; supplying a character string instead uses it as the log file's path, and logfile="" prints the log messages to the console. Default FALSE (no logging).
logstep: specifies at which number of posterior draws information is written to the log file (default: 10,000 iterations).
force.full.ols: default FALSE. If force.full.ols=TRUE, the OLS part of the sampling procedure is based on full matrix inversion rather than the faster updating routines; this slows down sampling but can be more robust when covariates are highly collinear.

fixed.reg: indices or variable names of columns in X.data that are to be included in every sampled model (default: none). Note that mprior.size refers to the total model size including these fixed regressors (cf. the last example). A brief sketch combining several of these options follows below.
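To illustrate how these arguments interact, here is a minimal sketch combining some of the options above (it assumes the datafls dataset shipped with the package; the argument values are illustrative, not recommendations):

data(datafls)
# unit-information g-prior, random (binomial-beta) model prior, birth-death sampler
m1 = bms(datafls, burn=1000, iter=2000, g="UIP", mprior="random", mcmc="bd", user.int=FALSE)
# hyper-g prior with prior expected shrinkage matching BRIC, uniform model prior
m2 = bms(datafls, burn=1000, iter=2000, g="hyper=BRIC", mprior="uniform", mcmc="bd", user.int=FALSE)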
Details
Ad mcmc: Interaction sampler: adding an ".int" to an MC3 sampler (e.g. mcmc="bd.int") provides for special treatment of interaction terms. Interaction terms will only be sampled along with their component variables: in the column names of X.data, interaction terms need to be denoted by names consisting of the base terms separated by "#" (e.g. an interaction term of the base variables "A", "B" and "C" needs the column name "A#B#C"). The variable "A#B#C" will then only be included in a model if all of its component variables ("A", "B" and "C") are included as well.
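As a small illustration of this naming convention (a hypothetical toy data set made up for this sketch, not part of the package):

# hypothetical toy data: dependent variable y plus base covariates A and B
set.seed(1)
myvars = data.frame(y=rnorm(50), A=rnorm(50), B=rnorm(50))
# the interaction column must be named with the base terms joined by "#"
myvars[["A#B"]] = myvars$A * myvars$B
# "A#B" then only enters models that also contain both A and B
bmaint = bms(myvars, burn=500, iter=2000, mcmc="bd.int", user.int=FALSE)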
The MC3 samplers "bd", "rev.jump", "bd.int" and "rev.jump.int" iterate away from a starting model by adding, dropping or swapping (only in the case of rev.jump) covariates. In an MCMC fashion, they randomly draw a candidate model and move to it if its marginal likelihood (marg.lik.) is superior to the marg.lik. of the current model. If the candidate's marg.lik. is inferior, it is randomly accepted or rejected with a probability given by the ratio of the candidate's marg.lik. over the current model's marg.lik. Over time, the sampler should thus converge to a sensible distribution. For aggregate results based on these MC3 frequencies, the first few iterations are typically disregarded (the 'burn-ins').
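Conceptually, this acceptance step is a standard Metropolis rule. The following is a sketch of that rule only, not the package's internal code; ml.current and ml.candidate stand for the two log marginal likelihoods:

# conceptual Metropolis acceptance step (not BMS internals)
accept_candidate = function(ml.current, ml.candidate) {
  # marginal likelihoods are handled in logs for numerical stability
  ratio = exp(ml.candidate - ml.current)   # marg.lik. ratio
  if (ratio >= 1) return(TRUE)             # superior candidate: always move
  runif(1) < ratio                         # otherwise accept with probability = ratio
}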
Ad g and the hyper-g prior: The hyper-g prior introduced by Liang et al. (2008) puts a prior distribution on the shrinkage factor g/(1+g), namely a Beta distribution Beta(1, a/2 - 1) that is governed by the parameter a (with 2 < a <= 4). a=4 means a uniform prior distribution of the shrinkage factor, while values of a close to 2 concentrate the prior shrinkage factor close to one. The prior expected value is E(g/(1+g)) = 2/a. In this sense g="hyper=UIP" and g="hyper=BRIC" set the prior expected shrinkage such that it conforms to a fixed UIP-g (g=N) or BRIC-g (g=max(K^2, N)).
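The relation E(g/(1+g)) = 2/a can be used to back out the value of a implied by a given prior expected shrinkage, as in this small illustrative calculation (not package code; N=72 is only an example):

# prior expected shrinkage under the hyper-g prior: E(g/(1+g)) = 2/a
N = 72                        # e.g. a sample size of 72 observations
a.uip = 2 + 2/N               # solves 2/a = N/(1+N), i.e. matches UIP shrinkage
c(a=a.uip, expected.shrinkage=2/a.uip, uip.shrinkage=N/(1+N))
2/4                           # a=4 implies prior expected shrinkage of 1/2 (uniform case)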
Value
A list of class bma that may be displayed using, e.g., summary.bma or coef.bma. The list contains the following elements:
info: a list of aggregate statistics from the sampling procedure (e.g. the numbers of draws and burn-ins and cumulated posterior statistics).

arguments: a list of the evaluated function arguments provided to bms (see above).

topmod: a 'topmod' object containing the best drawn models; see topmod for details.

start.pos: the positions of the covariates in the starting model. If bmao is a 'bma' object, these correspond to the covariates bmao$reg.names[bmao$start.pos]. If bmao is a chain that resulted from several starting models (cf. c.bma), start.pos contains the starting positions of all of them.

gprior.info: a list of class 'gprior' detailing the coefficient prior that was used (corresponds to argument g above).

mprior.info: a list of class 'mprior' detailing the model prior that was used (corresponds to arguments mprior and mprior.size above).

X.data: data frame or matrix: corresponds to argument X.data above.

reg.names: character vector: the covariate names used for X.data (corresponds to the column names of the covariates).

bms.call: the original call to the bms function.
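A brief sketch of how these elements are typically accessed (assuming the datafls data as in the Examples below; output omitted):

data(datafls)
bma1 = bms(datafls, burn=1000, iter=2000, user.int=FALSE)
bma1$topmod          # the 'topmod' object holding the best drawn models
bma1$gprior.info     # information on the coefficient prior used
bma1$reg.names       # covariate names
bma1$bms.call        # the original call
summary(bma1)        # aggregate posterior statistics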
Theoretical background
The models analyzed are Bayesian normal-gamma conjugate models with improper constant and variance priors akin to Fernandez, Ley and Steel (2001): A model M can be described as follows, with \epsilon ~ N(0, \sigma^2 I):

y = \alpha + X \beta + \epsilon

f(\beta | \sigma, M, g) ~ N(0, g \sigma^2 (X'X)^{-1})

Moreover, the (improper) prior on the constant f(\alpha) is put proportional to 1. Similarly, the variance prior f(\sigma) is proportional to 1/\sigma.
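Under this g-prior, the posterior expected value of \beta in a given model shrinks the OLS estimator by the factor g/(1+g). The following textbook-style sketch illustrates that relationship for a single model with a fixed g (it is not the package's internal code):

set.seed(42)
N = 100; X = cbind(rnorm(N), rnorm(N))
y = 1 + X %*% c(0.5, -0.3) + rnorm(N)
g = N                                   # UIP: g equal to the number of observations
Xc = scale(X, scale=FALSE)              # demean, since the constant has a flat prior
yc = y - mean(y)
beta.ols  = solve(crossprod(Xc), crossprod(Xc, yc))
beta.post = (g/(1+g)) * beta.ols        # posterior expected coefficients under the g-prior
cbind(beta.ols, beta.post)              # the second column is a shrunken version of the first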
Note
There are several ways to speed up sampling: nmodel=10 saves only the ten best models, at most a marginal improvement. nmodel=0 does not save the best (500) models at all; posterior convergence analysis and likelihood-based inference are then not possible, which renders the use of image.bma and the parameter exact=TRUE in functions such as coef.bma infeasible. g.stats=FALSE saves some time by not retaining the shrinkage factors for the MC3 chain (and the best models). force.full.ols=TRUE, in contrast, slows sampling down significantly if mcmc="enumerate".
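For example, the following call combines these speed-ups (a sketch; actual timings depend on the machine):

data(datafls)
# fast settings: no best-model storage, no shrinkage-factor statistics, no console output
system.time(bms(datafls, burn=1000, iter=20000, nmodel=0, g.stats=FALSE, user.int=FALSE))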
Author(s)
Martin Feldkircher, Paul Hofmarcher, and Stefan Zeugner
References
http://bms.zeugner.eu: BMS package homepage with help and tutorials
Feldkircher, M. and S. Zeugner (2015): Bayesian Model Averaging Employing Fixed and Flexible Priors: The BMS Package for R, Journal of Statistical Software 68(4).
Feldkircher, M. and S. Zeugner (2009): Benchmark Priors Revisited: On Adaptive Shrinkage and the Supermodel Effect in Bayesian Model Averaging, IMF Working Paper 09/202.
Fernandez, C., E. Ley and M. Steel (2001): Benchmark priors for Bayesian model averaging. Journal of Econometrics 100(2), 381–427.
Ley, E. and M. Steel (2008): On the Effect of Prior Assumptions in Bayesian Model Averaging with Applications to Growth Regressions. working paper
Liang, F., Paulo, R., Molina, G., Clyde, M. A., and Berger, J. O. (2008). Mixtures of g Priors for Bayesian Variable Selection. Journal of the American Statistical Association 103, 410-423.
Sala-i-Martin, X. and G. Doppelhofer and R.I. Miller (2004): Determinants of long-term growth: a Bayesian averaging of classical estimates (BACE) approach. American Economic Review 94(4), 813–835
See Also
coef.bma, plotModelsize and density.bma for some operations on the resulting 'bma' object; c.bma for integrating separate MC3 chains and splitting sampling over several runs.
Check http://bms.zeugner.eu for additional help.
Examples
data(datafls)
#estimating a standard MC3 chain with 1000 burn-ins and 2000 iterations and uniform model priors
bma1 = bms(datafls,burn=1000, iter=2000, mprior="uniform")
##standardized coefficients based on the exact likelihoods of the best models stored (cf. nmodel):
coef(bma1,exact=TRUE, std.coefs=TRUE)
#suppressing user-interactive output, using a customized starting value, and not saving the best
# ...models for a sample of only 20 observations (but 41 covariates)
bma2 = bms(datafls[20:39,],burn=1000, iter=2000, nmodel=0, start.value=c(1,4,7,30),
user.int=FALSE)
coef(bma2)
#MC3 chain with a hyper-g prior (custom coefficient a=2.1), saving only the 20 best models,
# ...and an alternative sampling procedure; putting a log entry to console every 1000th step
bma3 = bms(datafls,burn=1000, iter=5000, nmodel=20, g="hyper=2.1", mcmc="rev.jump",
logfile="",logstep=1000)
image(bma3) #showing the coefficient signs of the 20 best models
#enumerating with 10 covariates (= 1024 models), keeping the shrinkage factors
# ...of the best 200 models
bma4 = bms(datafls[,1:11],mcmc="enumerate",nmodel=200,g.stats=TRUE)
#using an interaction sampler for two interaction terms
dataint=datafls
dataint=cbind(datafls,datafls$LifeExp*datafls$Abslat/1000,
datafls$Protestants*datafls$Brit-datafls$Muslim)
names(dataint)[ncol(dataint)-1]="LifeExp#Abslat"
names(dataint)[ncol(dataint)]="Protestants#Brit#Muslim"
bma5 = bms(X.data=dataint,burn=1000,iter=9000,start.value=0,mcmc="bd.int")
density(bma5,reg="English") # plot posterior density for covariate "English"
# a matrix as X.data argument
bms(matrix(rnorm(1000),100,10))
# keeping a set of fixed regressors:
bms(datafls, mprior.size=7, fixed.reg = c("PrScEnroll", "LifeExp", "GDP60"))
# Note that mprior.size=7 means a prior model size of 3 fixed plus 4 'uncertain' regressors