bayesreg {bayesreg}		R Documentation
Fitting Bayesian Regression Models with Continuous Shrinkage Priors
Description
Fit a linear or generalized linear regression model using Bayesian continuous shrinkage prior distributions. Handles ridge, lasso, horseshoe and horseshoe+ regression with Gaussian,
Laplace, Student-t, logistic, Poisson or geometric distributed targets. See bayesreg-package
for more details on the features available in this package.
Usage
bayesreg(
formula,
data,
model = "normal",
prior = "ridge",
n.samples = 1000,
burnin = 1000,
thin = 5,
t.dof = 5
)
Arguments
formula
An object of class "formula": a symbolic description of the model to be fitted.
data
A data frame containing the variables in the model.
model
The distribution of the target (y) variable. Continuous or numeric targets can be distributed as per a Gaussian ("gaussian" or "normal"), Laplace ("laplace") or Student-t ("t") distribution. Binary targets (two-level factors) are fit using logistic regression ("logistic"), and count targets using Poisson ("poisson") or geometric ("geometric") regression.
prior
Which continuous shrinkage prior distribution over the regression coefficients to use. Options include ridge regression ("ridge"), lasso regression ("lasso"), horseshoe regression ("hs" or "horseshoe") and horseshoe+ regression ("hs+").
n.samples
Number of posterior samples to generate.
burnin
Number of burn-in samples.
thin
Desired level of thinning.
t.dof
Degrees of freedom for the Student-t distribution.
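As a rough illustration of how these arguments fit together (a sketch only; the data frame df and the target y are hypothetical placeholders):
# Hypothetical data frame 'df' with a numeric target 'y' and covariates.
# Horseshoe-regularised Gaussian regression: keep 2,000 posterior draws
# after 1,000 burn-in iterations, thinning every 5th draw.
rv <- bayesreg(y ~ ., data = df, model = "gaussian", prior = "horseshoe",
               n.samples = 2e3, burnin = 1e3, thin = 5)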
Value
An object with S3 class "bayesreg"
containing the results of the sampling process, plus some additional information.
beta
Posterior samples of the regression coefficients.
beta0
Posterior samples of the intercept parameter.
sigma2
Posterior samples of the square of the scale parameter; for Gaussian distributed targets this is equal to the variance. For binary targets this is empty.
mu.beta
The mean of the posterior samples for the regression coefficients.
mu.beta0
The mean of the posterior samples for the intercept parameter.
mu.sigma2
The mean of the posterior samples for the squared scale parameter.
tau2
Posterior samples of the global shrinkage parameter.
t.stat
Posterior t-statistics for each regression coefficient.
var.ranks
Ranking of the covariates by their importance, with "1" denoting the most important covariate.
log.l
The log-likelihood at the posterior means of the model parameters.
waic
The Widely Applicable Information Criterion (WAIC) score for the model.
waic.dof
The effective degrees of freedom of the model, as estimated by the WAIC.
The returned object also stores the parameters/options used to run bayesreg:
formula
The object of class "formula" describing the fitted model.
model
The distribution of the target (y) variable.
prior
The shrinkage prior used to fit the model.
n.samples
The number of samples generated from the posterior distribution.
burnin
The number of burn-in samples that were generated.
thin
The level of thinning.
n
The sample size of the data used to fit the model.
p
The number of covariates in the fitted model.
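For instance, the stored components can be accessed directly from the fitted object (a sketch; rv is assumed to be the result of an earlier call to bayesreg):
rv$mu.beta      # posterior means of the regression coefficients
rv$mu.beta0     # posterior mean of the intercept
rv$waic         # WAIC score of the fitted model
rv$var.ranks    # importance ranking of the covariates
rv$n; rv$p      # sample size and number of covariates used in the fit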
Details
Draws a series of samples from the posterior distribution of a linear (Gaussian, Laplace or Student-t) or generalized linear (logistic binary, Poisson, geometric) regression model with specified continuous shrinkage prior distribution (ridge regression, lasso, horseshoe and horseshoe+) using Gibbs sampling. The intercept parameter is always included, and is never penalised.
While only n.samples samples are returned, the total number of samples generated is equal to burnin + n.samples * thin. To generate the samples
of the regression coefficients, the code uses either Rue's algorithm (when the number of observations is at least twice the number of covariates) or the algorithm of
Bhattacharya et al., as appropriate. Factor variables are automatically grouped together and
additional shrinkage is applied to the set of indicator variables to which they expand.
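For example, with the default settings a call such as the following (a sketch; the data frame df and the factor covariate grp are hypothetical placeholders) runs burnin + n.samples * thin = 1000 + 1000 * 5 = 6000 Gibbs iterations in total and returns 1,000 thinned post-burn-in draws:
# 'df' and 'grp' are hypothetical; 'grp' is a factor, so its indicator
# variables are grouped and receive additional shared shrinkage.
rv <- bayesreg(y ~ x1 + grp, data = df, prior = "hs",
               n.samples = 1000, burnin = 1000, thin = 5)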
Note
To cite this toolbox please reference:
Makalic, E. & Schmidt, D. F., High-Dimensional Bayesian Regularised Regression with the BayesReg Package, arXiv:1611.06649 [stat.CO], 2016. https://arxiv.org/pdf/1611.06649.pdf
A MATLAB implementation of the bayesreg function is also available.
Copyright (C) Daniel F. Schmidt and Enes Makalic, 2016-2021
References
Makalic, E. & Schmidt, D. F., High-Dimensional Bayesian Regularised Regression with the BayesReg Package, arXiv:1611.06649 [stat.CO], 2016. https://arxiv.org/pdf/1611.06649.pdf
Park, T. & Casella, G., The Bayesian Lasso, Journal of the American Statistical Association, Vol. 103, pp. 681-686, 2008.
Carvalho, C. M.; Polson, N. G. & Scott, J. G., The horseshoe estimator for sparse signals, Biometrika, Vol. 97, pp. 465-480, 2010.
Makalic, E. & Schmidt, D. F., A Simple Sampler for the Horseshoe Estimator, IEEE Signal Processing Letters, Vol. 23, pp. 179-182, 2016. https://arxiv.org/pdf/1508.03884v4.pdf
Bhadra, A.; Datta, J.; Polson, N. G. & Willard, B., The Horseshoe+ Estimator of Ultra-Sparse Signals, Bayesian Analysis, 2016.
Polson, N. G.; Scott, J. G. & Windle, J., Bayesian inference for logistic models using Polya-Gamma latent variables, Journal of the American Statistical Association, Vol. 108, pp. 1339-1349, 2013.
Rue, H., Fast sampling of Gaussian Markov random fields, Journal of the Royal Statistical Society (Series B), Vol. 63, pp. 325-338, 2001.
Bhattacharya, A.; Chakraborty, A. & Mallick, B. K., Fast sampling with Gaussian scale-mixture priors in high-dimensional regression, arXiv:1506.04778, 2016.
Schmidt, D. F. & Makalic, E., Bayesian Generalized Horseshoe Estimation of Generalized Linear Models, ECML PKDD 2019: Machine Learning and Knowledge Discovery in Databases, pp. 598-613, 2019.
Stan Development Team, Stan Reference Manual (Version 2.26), Section 15.4, "Effective Sample Size". https://mc-stan.org/docs/2_18/reference-manual/effective-sample-size-section.html
See Also
The prediction function predict.bayesreg and the summary function summary.bayesreg.
Examples
# -----------------------------------------------------------------
# Example 1: Gaussian regression
X = matrix(rnorm(100*20),100,20)
b = matrix(0,20,1)
b[1:5] = c(5,4,3,2,1)
y = X %*% b + rnorm(100, 0, 1)
df <- data.frame(X,y)
rv.lm <- lm(y~.,df) # Regular least-squares
summary(rv.lm)
rv.hs <- bayesreg(y~.,df,prior="hs") # Horseshoe regression
rv.hs.s <- summary(rv.hs)
# Expected squared prediction error for least-squares
coef_ls = coef(rv.lm)
as.numeric(sum( (as.matrix(coef_ls[-1]) - b)^2 ) + coef_ls[1]^2)
# Expected squared prediction error for horseshoe
as.numeric(sum( (rv.hs$mu.beta - b)^2 ) + rv.hs$mu.beta0^2)
# -----------------------------------------------------------------
# Example 2: Gaussian v Student-t robust regression
X = 1:10
y = c(-0.6867, 1.7258, 1.9117, 6.1832, 5.3636, 7.1139, 9.5668, 10.0593, 11.4044, 6.1677)
df = data.frame(X,y)
# Gaussian ridge
rv.G <- bayesreg(y~., df, model = "gaussian", prior = "ridge", n.samples = 1e3)
# Student-t ridge
rv.t <- bayesreg(y~., df, model = "t", prior = "ridge", t.dof = 5, n.samples = 1e3)
# Plot the different estimates with credible intervals
plot(df$X, df$y, xlab="x", ylab="y")
yhat_G <- predict(rv.G, df, bayes.avg=TRUE)
lines(df$X, yhat_G[,1], col="blue", lwd=2.5)
lines(df$X, yhat_G[,3], col="blue", lwd=1, lty="dashed")
lines(df$X, yhat_G[,4], col="blue", lwd=1, lty="dashed")
yhat_t <- predict(rv.t, df, bayes.avg=TRUE)
lines(df$X, yhat_t[,1], col="darkred", lwd=2.5)
lines(df$X, yhat_t[,3], col="darkred", lwd=1, lty="dashed")
lines(df$X, yhat_t[,4], col="darkred", lwd=1, lty="dashed")
legend(1,11,c("Gaussian","Student-t (dof=5)"),lty=c(1,1),col=c("blue","darkred"),
lwd=c(2.5,2.5), cex=0.7)
## Not run:
# -----------------------------------------------------------------
# Example 3: Poisson/geometric regression example
X = matrix(rnorm(100*5),100,5)
b = c(0.5,-1,0,0,1)
nu = X%*%b + 1
y = rpois(n=length(nu), lambda=exp(nu))
df <- data.frame(X,y)
# Fit a Poisson regression
rv.pois=bayesreg(y~.,data=df,model="poisson",prior="hs", burnin=1e4, n.samples=1e4)
summary(rv.pois)
# Fit a geometric regression
rv.geo=bayesreg(y~.,data=df,model="geometric",prior="hs", burnin=1e4, n.samples=1e4)
summary(rv.geo)
# Compare the two models in terms of their WAIC scores
cat(sprintf("Poisson regression WAIC=%g vs geometric regression WAIC=%g",
rv.pois$waic, rv.geo$waic))
# Poisson is clearly preferred to geometric, which is good, as the data was generated from a Poisson!
# -----------------------------------------------------------------
# Example 4: Logistic regression on spambase
data(spambase)
# bayesreg expects binary targets to be factors
spambase$is.spam <- factor(spambase$is.spam)
# First take a subset of the data (1/10th) for training, reserve the rest for testing
spambase.tr = spambase[seq(1,nrow(spambase),10),]
spambase.tst = spambase[-seq(1,nrow(spambase),10),]
# Fit a model using logistic horseshoe for 2,000 samples
rv <- bayesreg(is.spam ~ ., spambase.tr, model = "logistic", prior = "horseshoe", n.samples = 2e3)
# Summarise, sorting variables by their importance rank
rv.s <- summary(rv,sort.rank=TRUE)
# Make predictions about testing data -- get class predictions and class probabilities
y_pred <- predict(rv, spambase.tst, type='class')
# Check how well our predictions did by generating a confusion matrix
table(y_pred, spambase.tst$is.spam)
# Calculate logarithmic loss on test data
y_prob <- predict(rv, spambase.tst, type='prob')
cat('Neg Log-Like for no Bayes average, posterior mean estimates: ', sum(-log(y_prob[,1])), '\n')
y_prob <- predict(rv, spambase.tst, type='prob', sum.stat="median")
cat('Neg Log-Like for no Bayes average, posterior median estimates: ', sum(-log(y_prob[,1])), '\n')
y_prob <- predict(rv, spambase.tst, type='prob', bayes.avg=TRUE)
cat('Neg Log-Like for Bayes average: ', sum(-log(y_prob[,1])), '\n')
## End(Not run)