Bvs {BayesVarSel}    R Documentation
Bayesian Variable Selection for linear regression models
Description
Exact computation of summaries of the posterior distribution using sequential computation.
Usage
Bvs(
  formula,
  data,
  null.model = paste(as.formula(formula)[[2]], " ~ 1", sep = ""),
  prior.betas = "Robust",
  prior.models = "ScottBerger",
  n.keep = 10,
  time.test = TRUE,
  priorprobs = NULL,
  parallel = FALSE,
  n.nodes = detectCores()
)
Arguments
formula
    Formula defining the most complex (full) regression model in the analysis. See details.

data
    A data frame containing the data.

null.model
    A formula defining the simplest (null) model. It should be nested in the full model. By default, the null model is the one containing just the intercept.

prior.betas
    Prior distribution for the regression parameters within each model (specified as a literal string). Possible choices include "Robust", "Robust.G", "Liangetal", "gZellner", "ZellnerSiow", "FLS", "intrinsic.MGC" and "IHG" (see details).

prior.models
    Prior distribution over the model space (specified as a literal string). Possible choices are "Constant", "ScottBerger" and "User" (see details).

n.keep
    The number of most probable models to keep. By default it is set to 10, which is automatically adjusted if 10 is greater than the total number of models.

time.test
    If TRUE and the number of variables is moderately large (>= 18), a preliminary test is performed to estimate the total computation time.

priorprobs
    A p+1 dimensional vector (p is the number of non-fixed covariates) defining the prior probabilities Pr(M_i) (to be used when prior.models = "User"; see details).

parallel
    A logical value specifying whether parallel computation should be used (TRUE to enable it).

n.nodes
    The number of cores to use when parallel computation is enabled.
Details
The model space is the set of all models, Mi, that contain the intercept and are nested in that specified by formula. The simplest of such models, M0, contains only the intercept. Then Bvs provides exact summaries of the posterior distribution over this model space, that is, summaries of the discrete distribution which assigns to each model Mi its probability given the data:

Pr(Mi | data) = Pr(Mi) * Bi / C,

where Bi is the Bayes factor of Mi to M0, Pr(Mi) is the prior probability of Mi and C is the normalizing constant.
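As a numerical illustration of this normalization, the following sketch computes the posterior distribution over a toy model space (all Bayes factors and prior probabilities below are made-up numbers, not taken from any real analysis):

# Hypothetical Bayes factors Bi of each model to the null (B = 1 for M0 itself):
B <- c(M0 = 1, M1 = 4.2, M2 = 0.7, M3 = 12.5)
# A Constant prior over this toy model space:
prior <- rep(1/4, 4)
# Normalizing constant C = sum(Bi * Pr(Mi)):
C <- sum(B * prior)
# Posterior probabilities Pr(Mi | data) = Pr(Mi) * Bi / C:
posterior <- prior * B / C
round(posterior, 3)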
The Bayes factor Bi depends on the prior assigned to the parameters within each regression model Mi, and Bvs implements a number of popular choices. The "Robust" prior by Bayarri, Berger, Forte and Garcia-Donato (2012) is the recommended (and default) choice. This prior can be implemented in a more stable way using the derivations in Greenaway (2019), available in BayesVarSel since version 2.2.x by setting the argument to "Robust.G".
Additional options are: "gZellner", which corresponds to the prior in Zellner (1986) with g=n; "Liangetal", the hyper-g/n prior with a=3 (see the original paper, Liang et al. 2008, for details); "ZellnerSiow", the multivariate Cauchy prior proposed by Zellner and Siow (1980, 1984) and further studied by Bayarri and Garcia-Donato (2007); "FLS", the (benchmark) prior recommended by Fernandez, Ley and Steel (2001), which is the prior in Zellner (1986) with g=max(n, p*p), p being the number of covariates to choose from (the most complex model has p plus the number of fixed covariates); "intrinsic.MGC", the intrinsic prior derived by Moreno, Giron and Casella (2015); and "IHG", the intrinsic hyper-g prior derived in Berger, Garcia-Donato, Moreno and Pericchi (2022).
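For instance, a sketch selecting a non-default parameter prior (reusing the UScrime data from the Examples section below):

library(BayesVarSel)
data(UScrime)
# Same full model as in the Examples, but with the Zellner-Siow
# multivariate Cauchy prior for the regression parameters:
crime.ZS <- Bvs(formula = y ~ ., data = UScrime, prior.betas = "ZellnerSiow")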
With respect to the prior over the model space, Pr(Mi), three possibilities are implemented: "Constant", under which every model has the same prior probability; "ScottBerger", under which Pr(Mi) is inversely proportional to the number of models of that dimension; and "User" (see below). The "ScottBerger" prior was studied by Scott and Berger (2010) and controls for multiplicity (it is the default choice since version 1.7.0).
When prior.models="User", the prior probabilities are defined through the p+1 dimensional parameter vector priorprobs. Let k be the number of explanatory variables in the simplest model (the one defined by null.model); then, except for the normalizing constant, the first component of priorprobs must contain the probability of each model with k covariates (there is only one); the second component of priorprobs should contain the probability of each model with k+1 covariates, and so on. Finally, the (p+1)th component of priorprobs defines the probability of the most complex model (that defined by formula). That is,

priorprobs[j] = Cprior * Pr(M_i such that M_i has j-1+k explanatory variables),

where Cprior is the normalizing constant for the prior, i.e.

Cprior = 1/sum(priorprobs*choose(p,0:p)).
Note that prior.models="Constant" is equivalent to the combination prior.models="User" and priorprobs=rep(1,(p+1)), but the internal functions are not the same and you may observe small variations in results due to these differences in the implementation. Similarly, prior.models="ScottBerger" is equivalent to the combination prior.models="User" and priorprobs=1/choose(p,0:p).
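A minimal sketch of these equivalences, assuming p = 3 candidate covariates (so priorprobs has p+1 = 4 components; mydata and the formula in the commented call are hypothetical placeholders):

p <- 3
# "Constant": every model receives the same prior probability
probs.constant <- rep(1, p + 1)
# "ScottBerger": mass inversely proportional to the number of models
# of each dimension
probs.SB <- 1/choose(p, 0:p)
# Either vector can be supplied through priorprobs:
# res <- Bvs(y ~ x1 + x2 + x3, data = mydata,
#            prior.models = "User", priorprobs = probs.SB)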
The case where n<p is handled by assigning a Bayes factor of 1 to every model with k regressors where k>n. This should be interpreted as a generalization of the null predictive matching in Bayarri et al. (2012). Use GibbsBvs for cases where p is large.
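For example, a sketch of the Gibbs-sampling alternative (GibbsBvs shares the formula/data interface of Bvs; its sampling-specific arguments have defaults, see its own help page):

library(BayesVarSel)
data(UScrime)
# Heuristic approximation by Gibbs sampling instead of exhaustive
# enumeration of the model space:
crime.Gibbs <- GibbsBvs(formula = y ~ ., data = UScrime)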
Limitations: about the error "A Bayes factor is infinite.". Bayes factors can be extremely large numbers if i) the sample size is large or ii) a competing model is much better (in terms of fit) than the model taken as the null model. If you see this error, try the more stable version of the robust prior, "Robust.G", and/or reconsider using a more accurate and realistic definition of the simplest model (via the null.model argument).
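A sketch combining both remedies (the choice of Ed as the fixed covariate in the null model is purely illustrative; null.model is given as a character string, matching the form of its default value):

library(BayesVarSel)
data(UScrime)
# More stable evaluation of the Robust prior plus a richer null model:
crime.stable <- Bvs(formula = y ~ ., data = UScrime,
                    prior.betas = "Robust.G",
                    null.model = "y ~ Ed")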
Value
Bvs returns an object of class Bvs with the following elements:

time
    The internal time consumed in solving the problem.

lmfull
    The lm object resulting from fitting the full model (that defined by formula).

lmnull
    The lm object resulting from fitting the null model (that defined by null.model).

variables
    The names of all the potential explanatory variables (the set of variables to select from).

n
    Number of observations.

p
    Number of explanatory variables to select from.

k
    Number of fixed variables.

HPMbin
    The binary expression of the Highest Posterior Probability model.

modelsprob
    A data.frame summarizing the n.keep most probable models a posteriori and their associated probabilities.

inclprob
    A named vector with the inclusion probabilities of all the variables.

jointinclprob
    A data.frame with the joint inclusion probabilities of all the variables.

postprobdim
    Posterior probabilities of the dimension of the true model.

call
    The call to the function.

C
    The value of the normalizing constant (C = sum(Bi*Pr(Mi)), for Mi in the model space).

method
    The computation scheme used ("full" or "parallel").

prior.betas
    The value of the prior.betas argument used in the call.

prior.models
    The value of the prior.models argument used in the call.

priorprobs
    The value of the priorprobs argument used in the call.
Author(s)
Gonzalo Garcia-Donato and Anabel Forte
Maintainer: Anabel Forte <anabel.forte@uv.es>
References
Bayarri, M.J., Berger, J.O., Forte, A. and Garcia-Donato, G. (2012)<DOI:10.1214/12-aos1013> Criteria for Bayesian Model choice with Application to Variable Selection. The Annals of Statistics, 40: 1550-1577.
Berger, J., Garcia-Donato, G., Moreno, E. and Pericchi, L. (2022). The intrinsic hyper-g prior for normal linear models. In preparation.
Bayarri, M.J. and Garcia-Donato, G. (2007)<DOI:10.1093/biomet/asm014> Extending conventional priors for testing general hypotheses in linear models. Biometrika, 94:135-152.
Barbieri, M and Berger, J (2004)<DOI:10.1214/009053604000000238> Optimal Predictive Model Selection. The Annals of Statistics, 32, 870-897.
Fernandez, C., Ley, E. and Steel, M.F.J. (2001)<DOI:10.1016/s0304-4076(00)00076-2> Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100, 381-427.
Greenaway, M. (2019) Numerically stable approximate Bayesian methods for generalized linear mixed models and linear model selection. Thesis (Department of Statistics, University of Sydney).
Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008)<DOI:10.1198/016214507000001337> Mixtures of g-priors for Bayesian Variable Selection. Journal of the American Statistical Association, 103: 410-423.
Moreno, E., Giron, J. and Casella, G. (2015) Posterior model consistency in variable selection as the model dimension grows. Statistical Science, 30: 228-241.
Scott, J.G. and Berger, J.O. (2010)<DOI:10.1214/10-aos792> Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics, 38: 2587-2619.
Zellner, A. and Siow, A. (1980)<DOI:10.1007/bf02888369> Posterior Odds Ratios for Selected Regression Hypotheses. In Bayesian Statistics 1 (J.M. Bernardo, M.H. DeGroot, D.V. Lindley and A.F.M. Smith, eds.) 585-603. Valencia: University Press.
Zellner, A. and Siow, A. (1984). Basic Issues in Econometrics. Chicago: University of Chicago Press.
Zellner, A. (1986)<DOI:10.2307/2233941> On Assessing Prior Distributions and Bayesian Regression Analysis with g-prior Distributions. In Bayesian Inference and Decision techniques: Essays in Honor of Bruno de Finetti (A. Zellner, ed.) 389-399. Edward Elgar Publishing Limited.
See Also
Use print.Bvs for the best visited models and an estimation of their posterior probabilities, and summary.Bvs for summaries of the posterior distribution.

plot.Bvs for several plots of the result, BMAcoeff for obtaining model averaged simulations of regression coefficients and predict.Bvs for predictions.

GibbsBvs for a heuristic approximation based on Gibbs sampling (recommended when p>20; no other possibility when p>31).

See GibbsBvsF if there are factors among the explanatory variables.
Examples
## Not run:
#Analysis of Crime Data
#load data
data(UScrime)
#Default arguments are Robust prior for the regression parameters
#and ScottBerger prior over the model space
#Here we keep the 1000 most probable models a posteriori:
crime.Bvs <- Bvs(formula = y ~ ., data = UScrime, n.keep = 1000)
#A look at the results:
crime.Bvs
summary(crime.Bvs)
#A plot with the posterior probabilities of the dimension of the
#true model:
plot(crime.Bvs, option="dimension")
#Two image plots of the conditional inclusion probabilities:
plot(crime.Bvs, option="conditional")
plot(crime.Bvs, option="not")
## End(Not run)