horseshoe {horseshoe}    R Documentation

Function to implement the horseshoe shrinkage prior in Bayesian linear regression

Description

This function implements the horseshoe shrinkage prior in Bayesian linear regression. The global-local scale parameters are updated via the slice sampling scheme given in the online supplement of Polson et al. (2014). Two different algorithms are used to compute posterior samples of the p*1 vector of regression coefficients \beta: the method proposed in Bhattacharya et al. (2016) when p > n, and the algorithm of Rue (2001) when p <= n. The function includes options for a full hierarchical Bayes version with hyperpriors on all parameters, and for empirical Bayes versions in which some parameters are set equal to a user-selected value.
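As an informal sketch (not taken from the package documentation), the calls below illustrate that the same interface covers both regimes; the sampler is selected internally from the dimensions of X, and the simulated data are placeholders.

# Illustration only: identical calls for p > n and p <= n; the internal
# sampler (Bhattacharya et al. 2016 vs. Rue 2001) is chosen from dim(X).
set.seed(1)
y1 <- rnorm(20);  X1 <- matrix(rnorm(20 * 50), 20)    # p = 50 > n = 20
y2 <- rnorm(100); X2 <- matrix(rnorm(100 * 10), 100)  # p = 10 <= n = 100
fit1 <- horseshoe(y1, X1, method.tau = "halfCauchy", method.sigma = "Jeffreys")
fit2 <- horseshoe(y2, X2, method.tau = "halfCauchy", method.sigma = "Jeffreys")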

Usage

horseshoe(y, X, method.tau = c("fixed", "truncatedCauchy", "halfCauchy"),
  tau = 1, method.sigma = c("fixed", "Jeffreys"), Sigma2 = 1,
  burn = 1000, nmc = 5000, thin = 1, alpha = 0.05)

Arguments

y

Response, an n*1 vector.

X

Matrix of covariates, dimension n*p.

method.tau

Method for handling \tau. Select "truncatedCauchy" for full Bayes with the Cauchy prior truncated to [1/p, 1], "halfCauchy" for full Bayes with the half-Cauchy prior, or "fixed" to use a fixed value (an empirical Bayes estimate, for example).

tau

Use this argument to pass the (estimated) value of \tau in case "fixed" is selected for method.tau; it is not needed when method.tau is equal to "halfCauchy" or "truncatedCauchy". The default (tau = 1) is not suitable for most purposes and should be replaced (see the sketch after this argument list).

method.sigma

Select "Jeffreys" for full Bayes with Jeffrey's prior on the error variance \sigma^2, or "fixed" to use a fixed value (an empirical Bayes estimate, for example).

Sigma2

A fixed value for the error variance \sigma^2. Use this argument to pass the (estimated) value of Sigma2 in case "fixed" is selected for method.sigma; it is not needed when method.sigma is equal to "Jeffreys". The default (Sigma2 = 1) is not suitable for most purposes and should be replaced (see the sketch after this argument list).

burn

Number of burn-in MCMC samples. Default is 1000.

nmc

Number of posterior draws to be saved. Default is 5000.

thin

Thinning parameter of the chain. Default is 1 (no thinning).

alpha

Level for the credible intervals. For example, alpha = 0.05 results in 95% credible intervals.
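The sketch below is not part of the original documentation; it shows one way these arguments might be combined, with a least-squares plug-in for Sigma2 (possible only when n > p) and a placeholder value for tau standing in for a proper empirical Bayes estimate.

# Hypothetical plug-in values for the "fixed" options (illustration only)
set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n)
y <- 2 * X[, 1] + rnorm(n)
s2.hat <- summary(lm(y ~ X - 1))$sigma^2        # one possible estimate of sigma^2
res <- horseshoe(y, X,
                 method.tau = "fixed", tau = 0.1,          # tau = 0.1 is a placeholder
                 method.sigma = "fixed", Sigma2 = s2.hat,
                 burn = 2000, nmc = 10000, thin = 2, alpha = 0.05)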

Details

The model is:

y = X\beta + \epsilon, \epsilon \sim N(0, \sigma^2 I_n)

The full Bayes version of the horseshoe, with hyperpriors on both \tau and \sigma^2, is:

\beta_j \sim N(0, \sigma^2 \lambda_j^2 \tau^2)

\lambda_j \sim Half-Cauchy(0, 1), \tau \sim Half-Cauchy(0, 1)

p(\sigma^2) \propto 1/\sigma^2

There is an option for a truncated Half-Cauchy prior (truncated to [1/p, 1]) on \tau. Empirical Bayes versions are available as well, where \tau and/or \sigma^2 are taken equal to fixed values, possibly estimated using the data.
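For intuition, the lines below (not taken from the package) simulate data from the hierarchical model above, with \sigma^2 and \tau fixed at illustrative values.

set.seed(123)
n <- 50; p <- 100
sigma2 <- 1; tau <- 0.05                  # illustrative fixed values
lambda <- abs(rcauchy(p))                 # lambda_j ~ Half-Cauchy(0, 1)
beta   <- rnorm(p, 0, sqrt(sigma2 * tau^2 * lambda^2))    # beta_j | lambda_j, tau, sigma^2
X      <- matrix(rnorm(n * p), n)
y      <- X %*% beta + rnorm(n, 0, sqrt(sigma2))          # y = X beta + epsilon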

Value

BetaHat

Posterior mean of Beta, a p by 1 vector.

LeftCI

The left bounds of the credible intervals.

RightCI

The right bounds of the credible intervals.

BetaMedian

Posterior median of Beta, a p by 1 vector.

Sigma2Hat

Posterior mean of error variance \sigma^2. If method.sigma = "fixed" is used, this value will be equal to the user-selected value of Sigma2 passed to the function.

TauHat

Posterior mean of global scale parameter tau, a positive scalar. If method.tau = "fixed" is used, this value will be equal to the user-selected value of tau passed to the function.

BetaSamples

Posterior samples of Beta.

TauSamples

Posterior samples of tau.

Sigma2Samples

Posterior samples of Sigma2.
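The lines below sketch further summaries of the returned samples; they assume a fitted object res as in the Examples, and the orientation of BetaSamples (assumed here to be one row per coefficient and one column per draw) should be checked via dim(res$BetaSamples).

dim(res$BetaSamples)                      # assumed p x nmc; verify before use
ci90 <- apply(res$BetaSamples, 1, quantile, probs = c(0.05, 0.95))  # 90% intervals
hist(res$TauSamples, breaks = 50)         # posterior draws of the global scale tau
plot(res$Sigma2Samples, type = "l")       # traceplot of the error variance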

References

Bhattacharya, A., Chakraborty, A., and Mallick, B.K. (2016), Fast sampling with Gaussian scale-mixture priors in high-dimensional regression. Biometrika 103(4), 985–991.

Polson, N.G., Scott, J.G., and Windle, J. (2014), The Bayesian Bridge. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76(4), 713–733.

Rue, H. (2001), Fast sampling of Gaussian Markov random fields. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63(2), 325–338.

Carvalho, C.M., Polson, N.G., and Scott, J.G. (2010), The Horseshoe Estimator for Sparse Signals. Biometrika 97(2), 465–480.

See Also

HS.normal.means for a faster version specific to the sparse normal means problem (design matrix X equal to the identity matrix); HS.post.mean for a fast way to estimate the posterior mean in the sparse normal means problem when a value for tau is available (a usage sketch follows below).
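A rough sketch for the normal means case follows; the argument lists of HS.normal.means and HS.post.mean are assumed here to mirror those of horseshoe (method.tau/method.sigma) and to take a value of tau plus Sigma2, respectively, so consult their help pages before use.

# Assumed interfaces (check ?HS.normal.means and ?HS.post.mean)
set.seed(1)
y <- c(rep(0, 80), rep(8, 20)) + rnorm(100)
fit.nm <- HS.normal.means(y, method.tau = "truncatedCauchy", method.sigma = "Jeffreys")
pm     <- HS.post.mean(y, tau = 0.05, Sigma2 = 1)   # tau = 0.05 is a placeholder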

Examples

## Not run: #In this example, there are no relevant predictors
#20 observations, 30 predictors (betas)
y <- rnorm(20)
X <- matrix(rnorm(20*30), 20)
res <- horseshoe(y, X, method.tau = "truncatedCauchy", method.sigma = "Jeffreys")

plot(y, X%*%res$BetaHat) #plot predicted values against the observed data
res$TauHat #posterior mean of tau
HS.var.select(res, y, method = "intervals") #selected betas
#Ideally, none of the betas is selected (all zeros)
#Plot the credible intervals
library(Hmisc)
xYplot(Cbind(res$BetaHat, res$LeftCI, res$RightCI) ~ 1:30)

## End(Not run)

## Not run:  #The horseshoe applied to the sparse normal means problem
# (note that HS.normal.means is much faster in this case)
X <- diag(100)
beta <- c(rep(0, 80), rep(8, 20))
y <- beta + rnorm(100)
res2 <- horseshoe(y, X, method.tau = "truncatedCauchy", method.sigma = "Jeffreys")
#Plot predicted values against the observed data (signals in blue)
plot(y, X%*%res2$BetaHat, col = c(rep("black", 80), rep("blue", 20)))
res2$TauHat #posterior mean of tau
HS.var.select(res2, y, method = "intervals") #selected betas
#Ideally, the final 20 predictors are selected
#Plot the credible intervals
library(Hmisc)
xYplot(Cbind(res2$BetaHat, res2$LeftCI, res2$RightCI) ~ 1:100)

## End(Not run)

