R: Fitting Bayesian Poisson Regression

sample_bpr {bpr}

R Documentation

Fitting Bayesian Poisson Regression

Description

The function generates draws from the posterior distribution of the coefficients of Poisson regression models. The method allows for Gaussian and horseshoe (Carvalho et al, 2010) prior distributions, and relies on a Metropolis-Hastings or importance sampler algorithm.

Usage

sample_bpr(
  formula = NULL,
  data = NULL,
  iter,
  burnin = NULL,
  prior = list(type = "gaussian", b = NULL, B = NULL, tau = NULL),
  pars = list(method = "MH", max_dist = 50, max_r = NULL, max_dist_burnin = 1e+06),
  state = NULL,
  thin = 1,
  verbose = TRUE,
  seed = NULL,
  nchains = 1,
  perc_burnin = 0.25
)

Arguments

`formula`	an object of class "formula": a symbolic description of the model to be fitted.
`data`	data frame or matrix containing the variables in the model.
`iter`	number of algorithm iterations.
`burnin`	(optional) a positive integer specifying the length of the burn-in. If a value > 1 is provided, the first `burnin` iterations use a different tuning parameter in order to better explore the parameter space.
`prior`	a named list of parameters to select prior type and parameters, with arguments: `type` : string specifying whether an informative Gaussian (`"gaussian"`) or a horseshoe (`"horseshoe"`) prior should be used. Default is `"gaussian"`. `b, B` : (optional) if a Gaussian prior is used, the mean and covariance matrix passed as prior parameters. If not specified, the prior on the regression parameters is centered at zero, with independent N(0,2) components. `tau` : if a horseshoe prior is used, the global shrinkage parameter tau has to be fixed. This can be seen as an empirical Bayes approach, and allows to speed convergence and avoid potential convergence issues that often occur when it is sampled. In general, the parameter can be interpreted as a measure of sparsity, and it should be fixed to small values. See van der Pas et al. (2017) for a discussion.
`pars`	a named list of parameters to select algorithm type and tuning parameters, with arguments: `method` : the type of algorithm used. Default is a Metropolis-Hastings algorithm (`"MH"`), the alternative is an importance sampler algorithm (`"IS"`). `max_dist` : tuning parameter controlling the "distance" of the approximation to the true target posterior. For the Metropolis-Hastings algorithm, it can be used to balance acceptance rate and autocorrelation of the chains. As a general indication, larger values are needed for increasing size/dimension of the data to obtain good results. #' `max_r` : (optional) additional tuning parameter which sets an upper bound for the parameters r controlling the approximation. `max_dist_burnin` : if `burnin` is specified, the tuning parameter used for the first part of the chain. A very large value is sometimes useful to explore the parameter space (especially if the chains are initialized very far from their stationary distribution).
`state`	optional vector providing the starting points of the chains.
`thin`	a positive integer specifying the period for saving samples. The default is 1.
`verbose`	logical (default = TRUE) indicating whether to print messages on the progress of the algorithm and possible convergence issues.
`seed`	(optional) positive integer: the seed of random number generator.
`nchains`	(optional) positive integer specifying the number of Markov chains. The default is 1.
`perc_burnin`	(default = 0.25) percentage of the chain to be discarded to perform inference. If both burnin and perc_burnin are specified, the most conservative burn-in is considered.

Details

This function fits a Bayesian Poisson regression model with Gaussian prior distributions on the regression coefficients:

Y ~ Poisson(\lambda) , \lambda = exp{X \beta}

where Y is a size n vector of counts and X is a n x p matrix of coefficients; and (\beta | - ) has a Gaussian distribution (possibly conditionally on some parameters).

Specifically, the function allows for informative Gaussian prior distribution on the parameters, i.e. (\beta_1,...,\beta_p) ~ N_p(b, B), and for a horseshoe prior distribution (Carvalho et al, 2010). The horseshoe prior is a scale mixture of normals, which is typically used in high-dimension settings to induce sparsity and regularization of the coefficients.

The implemented Metropolis-Hastings and importance sampler exploit as proposal density a multivariate Gaussian approximation of the posterior distribution. Such proposal is based on the convergence of the negative binomial distribution to the Poisson distribution and on the Polya-gamma data augmentation of Polson et al. (2013).

The output of the sampling is an object of class poisreg and admits class-specific methods to perform inference.
The function summary.poisreg can be used to obtain or print a summary of the results and of the algorithm diagnostics.
The function mcmc_diagnostics can be used to obtain or print convergence diagnostics for the sampled chains.
The function plot.poisreg prints the trace of the sampled values and a density estimate of the regression coefficients. See plot.mcmc.
The function posterior_predictive can be used to compute the posterior predictive distributions to check the model. See also the related function plot.ppc.

Value

An object of S3 class poisreg containing the results of the sampling.
poisreg is a list containing at least the following elements:

sim : list of the results of the sampling. It contains the following elements:

beta : mcmc object of posterior draws of the regression coefficients.
r : the sequence of adaptive tuning parameters used in each iteration.
time : the total amount of time to perform the simulation.

formula : the formula object used.

data : list with elements the matrix of covariates X and response variable y.

state : the starting points of the chain.

burnin : length of the used burn-in.

prior : whether a Gaussian or horseshoe prior was used.

prior_pars : prior parameters.

thin : thinning frequency passed to the thin parameter.

nchains : number of chains. If nchains was chosen >1, the output list will also include additional numbered sim elements, one for each sampled chain.

perc_burnin : percentage of the chain used as burn-in.

References

Carvalho, C., Polson, N., & Scott, J. (2010). The horseshoe estimator for sparse signals. Biometrika, 97(2), 465-480.

van der Pas, S., Szabo, B. and van der Vaart, A. (2017), Adaptive posterior contractionrates for the horseshoe, Electronic Journal of Statistics, 11(2), 3196-3225.

Examples

require(MASS) # load the data set
head(epil)

fit = sample_bpr( y ~  lbase*trt + lage + V4, data = epil, 
                   iter = 1000)
                   
summary(fit)    # summary of posterior inference
mcmc_diagnostics(fit)    # summary of MCMC convergence diagnostics

plot(fit)    


## Examples with different options
# Select prior parameters and set tuning parameter 
fit2 = sample_bpr( y ~  lbase*trt + lage + V4, data = epil, 
                    iter = 1000, 
                    prior = list( type = "gaussian", b = rep(0, 6), 
                                  B = diag(6) * 3 ),
                    pars = list( max_dist = 10 ))
                    
# Simulate multiple chains and merge outputs after checking convergence
fit3 = sample_bpr( y ~  lbase*trt + lage + V4, data = epil, 
                    iter = 1000, 
                    nchains = 4, thin = 2)
# fit3 now contains additional elements ($sim2, $sim3, $sim4)

mcmc_diagnostics(fit3)     
# the Gelman-Rubin diagnostics confirms convergence of the 4 
# independent chains to the same stationary distribution

fit3b = merge_sim(fit3) 
str(fit3b$sim)    
# fit 3b contains only one MCMC chain of length 1500 
# (after thinning and burn-in)


## introduce more variables and use regularization
epil2 <- epil[epil$period == 1, ]
epil2["period"] <- rep(0, 59); epil2["y"] <- epil2["base"]
epil["time"] <- 1; epil2["time"] <- 4
epil2 <- rbind(epil, epil2)
epil2$pred <- unclass(epil2$trt) * (epil2$period > 0) 
epil2$subject <- factor(epil2$subject)
epil3 <- aggregate(epil2, list(epil2$subject, epil2$period > 0),
                   function(x) if(is.numeric(x)) sum(x) else x[1])
                   epil3$pred <- factor(epil3$pred,
                   labels = c("base", "placebo", "drug"))
contrasts(epil3$pred) <- structure(contr.sdif(3),
                         dimnames = list(NULL, c("placebo-base", "drug-placebo")))
                         
fit4 = sample_bpr(y ~ pred + factor(subject), data = epil3,
                pars = list(max_dist = 0.3),
                prior = list(type = "horseshoe", tau = 2),
                iter = 3000, burnin = 1000)
summary(fit4)
mcmc_diagnostics(fit4)
plot(posterior_predictive(fit4), stats = c("mean", "sd", "max"))

[Package bpr version 1.0.8 Index]