R: The horseshoe prior for the sparse normal means problem

HS.normal.means {horseshoe}

R Documentation

The horseshoe prior for the sparse normal means problem

Description

Apply the horseshoe prior to the normal means problem (i.e. linear regression with the design matrix equal to the identity matrix). Computes the posterior mean, median and credible intervals. There are options for empirical Bayes (estimate of tau and or Sigma2 plugged in) and full Bayes (truncated or non-truncated half-Cauchy on tau, Jeffrey's prior on Sigma2). For the full Bayes version, the truncated half-Cauchy prior is recommended by Van der Pas et al. (2016).

Usage

HS.normal.means(y, method.tau = c("fixed", "truncatedCauchy",
  "halfCauchy"), tau = 1, method.sigma = c("fixed", "Jeffreys"),
  Sigma2 = 1, burn = 1000, nmc = 5000, alpha = 0.05)

Arguments

`y`	The data. A `n*1` vector.
`method.tau`	Method for handling `\tau`. Select "fixed" to plug in an estimate of tau (empirical Bayes), "truncatedCauchy" for the half- Cauchy prior truncated to [1/n, 1], or "halfCauchy" for a non-truncated half-Cauchy prior. The truncated Cauchy prior is recommended over the non-truncated version.
`tau`	Use this argument to pass the (estimated) value of `\tau` in case "fixed" is selected for method.tau. Not necessary when method.tau is equal to"halfCauchy" or "truncatedCauchy". The function `HS.MMLE` can be used to compute an estimate of tau. The default (tau = 1) is not suitable for most purposes and should be replaced.
`method.sigma`	Select "fixed" for a fixed error variance, or "Jeffreys" to use Jeffrey's prior.
`Sigma2`	The variance of the data - only necessary when "fixed" is selected for method.sigma. The default (Sigma2 = 1) is not suitable for most purposes and should be replaced.
`burn`	Number of samples used for burn-in. Default is 1000.
`nmc`	Number of MCMC samples taken after burn-in. Default is 5000.
`alpha`	The level for the credible intervals. E.g. alpha = 0.05 yields 95% credible intervals

Details

The normal means model is:

y_i=\beta_i+\epsilon_i, \epsilon_i \sim N(0,\sigma^2)

And the horseshoe prior:

\beta_j \sim N(0,\sigma^2 \lambda_j^2 \tau^2)

\lambda_j \sim Half-Cauchy(0,1).

Estimates of \tau and \sigma^2 may be plugged in (empirical Bayes), or those parameters are equipped with hyperpriors (full Bayes).

Value

`BetaHat`	The posterior mean (horseshoe estimator) for each of the datapoints.
`LeftCI`	The left bounds of the credible intervals.
`RightCI`	The right bounds of the credible intervals.
`BetaMedian`	Posterior median of Beta, a `n` by 1 vector.
`Sigma2Hat`	Posterior mean of error variance `\sigma^2`. If method.sigma = "fixed" is used, this value will be equal to the user-selected value of Sigma2 passed to the function.
`TauHat`	Posterior mean of global scale parameter tau, a positive scalar. If method.tau = "fixed" is used, this value will be equal to the user-selected value of tau passed to the function.
`BetaSamples`	Posterior samples of Beta.
`TauSamples`	Posterior samples of tau.
`Sigma2Samples`	Posterior samples of Sigma2.

References

van der Pas, S.L., Szabo, B., and van der Vaart, A. (2017), Uncertainty quantification for the horseshoe (with discussion). Bayesian Analysis 12(4), 1221-1274.

van der Pas, S.L., Szabo, B., and van der Vaart A. (2017), Adaptive posterior contraction rates for the horseshoe. Electronic Journal of Statistics 10(1), 3196-3225.

Examples

#Empirical Bayes example with 20 signals, rest is noise
#Posterior mean for the signals is plotted
#And variable selection is performed using the credible intervals
#And the credible intervals are plotted
truth <- c(rep(0, 80), rep(8, 20))
data <-  truth + rnorm(100, 1)
tau.hat <- HS.MMLE(data, Sigma2 = 1)
res.HS1 <- HS.normal.means(data, method.tau = "fixed", tau = tau.hat,
method.sigma = "fixed", Sigma2 = 1)
#Plot the posterior mean against the data (signals in blue)
plot(data, res.HS1$BetaHat, col = c(rep("black", 80), rep("blue", 20)))
#Find the selected betas (ideally, the last 20 are equal to 1)
HS.var.select(res.HS1, data, method = "intervals")
#Plot the credible intervals
library(Hmisc)
xYplot(Cbind(res.HS1$BetaHat, res.HS1$LeftCI, res.HS1$RightCI) ~ 1:100)


#Full Bayes example with 20 signals, rest is noise
#Posterior mean for the signals is plotted
#And variable selection is performed using the credible intervals
#And the credible intervals are plotted
truth <- c(rep(0, 80), rep(8, 20))
data <-  truth + rnorm(100, 3)
res.HS2 <- HS.normal.means(data, method.tau = "truncatedCauchy", method.sigma = "Jeffreys")
#Plot the posterior mean against the data (signals in blue)
plot(data, res.HS2$BetaHat, col = c(rep("black", 80), rep("blue", 20)))
#Find the selected betas (ideally, the last 20 are equal to 1)
HS.var.select(res.HS2, data, method = "intervals")
#Plot the credible intervals
library(Hmisc)
xYplot(Cbind(res.HS2$BetaHat, res.HS2$LeftCI, res.HS2$RightCI) ~ 1:100)

[Package horseshoe version 0.2.0 Index]