logiths {horseshoenlm}    R Documentation
Horseshoe shrinkage prior in Bayesian Logistic regression
Description
This function employs the algorithm of Makalic and Schmidt (2016) for the binary logistic model to fit a Bayesian logistic regression with the horseshoe shrinkage prior. The latent observations are updated according to the Polya-Gamma data augmentation approach of Polson, Scott, and Windle (2014).
Usage
logiths(
z,
X,
method.tau = c("fixed", "truncatedCauchy", "halfCauchy"),
tau = 1,
burn = 1000,
nmc = 5000,
thin = 1,
alpha = 0.05,
Xtest = NULL
)
Arguments
z: Response, an n*1 vector of binary (0/1) observations.

X: Matrix of covariates, dimension n*p.

method.tau: Method for handling the global scale parameter tau: "fixed" uses the value supplied via the tau argument, while "truncatedCauchy" and "halfCauchy" place a truncated or full half-Cauchy prior on tau, respectively.

tau: Use this argument to pass the (estimated) value of tau when method.tau = "fixed"; ignored otherwise. Default is 1.

burn: Number of burn-in MCMC samples. Default is 1000.

nmc: Number of posterior draws to be saved. Default is 5000.

thin: Thinning parameter of the chain. Default is 1 (no thinning).

alpha: Level for the credible intervals. For example, alpha = 0.05 results in 95% credible intervals.

Xtest: Test design matrix.
Details
The model is:

z_i is a binary response (0 or 1), with

\Pr(z_i = 1) = \text{logit}^{-1}(x_i^T\beta) = \frac{\exp(x_i^T\beta)}{1 + \exp(x_i^T\beta)}.
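As a small illustration (not part of the package), the Bernoulli likelihood implied by this model can be written directly in base R; `plogis` is base R's inverse-logit. The toy matrix `X` and coefficients `beta` below are made up for the example:

```r
## Illustration only: the Bernoulli likelihood implied by the logistic model,
## using base R's plogis() as the inverse-logit.
set.seed(1)
X    <- matrix(rnorm(20), nrow = 10, ncol = 2)  # toy design matrix
beta <- c(1, -0.5)                              # toy coefficients
p    <- plogis(X %*% beta)                      # Pr(z_i = 1) = logit^{-1}(x_i' beta)
z    <- rbinom(10, size = 1, prob = p)          # simulated 0/1 responses
loglik <- sum(dbinom(z, size = 1, prob = p, log = TRUE))  # Bernoulli log-likelihood
```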
Value
ProbHat: Predictive probabilities for the test observations in Xtest.

BetaHat: Posterior mean of beta, a p*1 vector.

LeftCI: The left bounds of the credible intervals.

RightCI: The right bounds of the credible intervals.

BetaMedian: Posterior median of beta, a p*1 vector.

LambdaHat: Posterior samples of lambda, the local scale parameters.

TauHat: Posterior mean of the global scale parameter tau, a positive scalar.

BetaSamples: Posterior samples of beta.

TauSamples: Posterior samples of tau.

LikelihoodSamples: Posterior samples of the likelihood.

DIC: Deviance Information Criterion of the fitted model.

WAIC: Widely Applicable Information Criterion.
References
Stephanie van der Pas, James Scott, Antik Chakraborty and Anirban Bhattacharya (2016). horseshoe: Implementation of the Horseshoe Prior. R package version 0.1.0. https://CRAN.R-project.org/package=horseshoe
Enes Makalic and Daniel Schmidt (2016). High-Dimensional Bayesian Regularised Regression with the BayesReg Package. arXiv:1611.06649
Polson, N.G., Scott, J.G. and Windle, J. (2014). The Bayesian Bridge. Journal of the Royal Statistical Society, Series B, 76(4), 713-733.
Examples
set.seed(123) # for reproducibility
burnin <- 100
nmc <- 500
thin <- 1
p <- 100 # number of predictors
ntrain <- 250 # training size
ntest <- 100 # test size
n <- ntest + ntrain # sample size
q <- 10 # number of true predictors
beta.t <- c(sample(x = c(1, -1), size = q, replace = TRUE), rep(0, p - q))
x <- mvtnorm::rmvnorm(n, mean = rep(0, p), sigma = diag(p))
zmean <- x %*% beta.t
z <- rbinom(n, size = 1, prob = boot::inv.logit(zmean))
X <- scale(as.matrix(x)) # standardization
# Training set
ztrain <- z[1:ntrain]
Xtrain <- X[1:ntrain, ]
# Test set
ztest <- z[(ntrain + 1):n]
Xtest <- X[(ntrain + 1):n, ]
posterior.fit <- logiths(z = ztrain, X = Xtrain, method.tau = "halfCauchy",
burn = burnin, nmc = nmc, thin = 1,
Xtest = Xtest)
posterior.fit$BetaHat
# Posterior processing to recover the true predictors
cluster <- kmeans(abs(posterior.fit$BetaHat), centers = 2)$cluster
cluster1 <- which(cluster == 1)
cluster2 <- which(cluster == 2)
min.cluster <- ifelse(length(cluster1) < length(cluster2), 1, 2)
which(cluster == min.cluster) # these should match the true (nonzero) predictors
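Since a test design matrix was supplied, the ProbHat component can also be turned into out-of-sample class predictions. A sketch, continuing the example above and assuming posterior.fit$ProbHat holds Pr(z_i = 1) for each row of Xtest:

```r
## Sketch: out-of-sample classification from the predictive probabilities.
## Assumes posterior.fit$ProbHat contains Pr(z_i = 1) for each row of Xtest.
zpred <- as.numeric(posterior.fit$ProbHat > 0.5) # 0/1 predictions at threshold 0.5
mean(zpred == ztest)                             # test-set accuracy
```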