probiths {horseshoenlm} | R Documentation |
Horseshoe shrinkage prior in Bayesian Probit regression
Description
This function employs the algorithm provided by Makalic and Schmidt (2016) for binary probit model to fit Bayesian probit regression. The observations are updated according to the data augmentation of approach of Albert and Chib (1993).
The model is:
z_i
is response either 1 or 0,
\log \Pr(z_i = 1) = \Phi(X\beta), \Phi \sim N(0,\sigma^2)
.
Usage
probiths(
z,
X,
method.tau = c("fixed", "truncatedCauchy", "halfCauchy"),
tau = 1,
burn = 1000,
nmc = 5000,
thin = 1,
alpha = 0.05,
Xtest = NULL
)
Arguments
z |
Response, a |
X |
Matrix of covariates, dimension |
method.tau |
Method for handling |
tau |
Use this argument to pass the (estimated) value of |
burn |
Number of burn-in MCMC samples. Default is 1000. |
nmc |
Number of posterior draws to be saved. Default is 5000. |
thin |
Thinning parameter of the chain. Default is 1 (no thinning). |
alpha |
Level for the credible intervals. For example, alpha = 0.05 results in 95% credible intervals. |
Xtest |
test design matrix. |
Value
ProbHat |
Predictive probability |
BetaHat |
Posterior mean of Beta, a |
LeftCI |
The left bounds of the credible intervals |
RightCI |
The right bounds of the credible intervals |
BetaMedian |
Posterior median of Beta, a |
LambdaHat |
Posterior samples of |
TauHat |
Posterior mean of global scale parameter tau, a positive scalar |
BetaSamples |
Posterior samples of |
TauSamples |
Posterior samples of |
LikelihoodSamples |
Posterior samples of likelihood |
DIC |
Devainace Information Criterion of the fitted model |
WAIC |
Widely Applicable Information Criterion |
References
Stephanie van der Pas, James Scott, Antik Chakraborty and Anirban Bhattacharya (2016). horseshoe: Implementation of the Horseshoe Prior. R package version 0.1.0. https://CRAN.R-project.org/package=horseshoe
Enes Makalic and Daniel Schmidt (2016). High-Dimensional Bayesian Regularised Regression with the BayesReg Package arXiv:1611.06649
Albert, J. H., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American statistical Association, 88(422), 669-679.
Examples
burnin <- 100
nmc <- 200
thin <- 1
y.sd <- 1 # statndard deviation of the response
p <- 200 # number of predictors
ntrain <- 250 # training size
ntest <- 100 # test size
n <- ntest + ntrain # sample size
q <- 10 # number of true predictos
beta.t <- c(sample(x = c(1, -1), size = q, replace = TRUE), rep(0, p - q))
x <- mvtnorm::rmvnorm(n, mean = rep(0, p))
zmean <- x %*% beta.t
y <- rnorm(n, mean = zmean, sd = y.sd)
z <- ifelse(y > 0, 1, 0)
X <- scale(as.matrix(x)) # standarization
z <- as.numeric(as.matrix(c(z)))
# Training set
ztrain <- z[1:ntrain]
Xtrain <- X[1:ntrain, ]
# Test set
ztest <- z[(ntrain + 1):n]
Xtest <- X[(ntrain + 1):n, ]
posterior.fit <- probiths(z = ztrain, X = Xtrain, method.tau = "halfCauchy",
burn = burnin, nmc = nmc, thin = 1,
Xtest = Xtest)
posterior.fit$BetaHat
# Posterior processing to recover the significant predictors
cluster <- kmeans(abs(posterior.fit$BetaHat), centers = 2)$cluster # return cluster indices
cluster1 <- which(cluster == 1)
cluster2 <- which(cluster == 2)
min.cluster <- ifelse(length(cluster1) < length(cluster2), 1, 2)
which(cluster == min.cluster) # this matches with the true variables