R: Fitting Bayesian Constrained Generalised Additive Models

bcgam {bcgam}

R Documentation

Fitting Bayesian Constrained Generalised Additive Models

Description

bcgam is used to fit generalised partial linear regression models using a Bayesian approach, where shape and smoothness constraints are imposed on nonparametrically modelled predictors through shape-restricted splines, and no constraints are imposed on optional parametrically modelled covariates.

Usage

bcgam(formula, family = gaussian(), data = NULL, nloop = 10000,
  burnin = trunc(nloop/10))

Arguments

`formula`	an object of class `formula` that contains a symbolic description of the model to be fitted. It has the form "response~nonparam+param", where "nonparam" are the predictors to be modelled nonparametrically and "param" are the optional predictors to be modelled parametrically. The user has to specify the relationship between the systematic component `\eta` and any nonparametrically modelled predictor `x`. The options are: sm.incr(x): `x` is smooth and increasing in `\eta`. See `sm.incr` for more details. sm.decr(x): `x` is smooth and decreasing in `\eta`. See `sm.decr` for more details. sm.conv(x): `x` is smooth and convex in `\eta`. See `sm.conv` for more details. sm.conc(x): `x` is smooth and concave in `\eta`. See `sm.conc` for more details. sm.incr.conv(x): `x` is smooth, increasing and convex in `\eta`. See `sm.incr.conv` for more details. sm.decr.conv(x): `x` is smooth, decreasing and convex in `\eta`. See `sm.decr.conv` for more details. sm.incr.conc(x): `x` is smooth, increasing and concave in `\eta`. See `sm.incr.conc` for more details. sm.decr.conc(x): `x` is smooth, decreasing and concave in `\eta`. See `sm.decr.conc` for more details.
`family`	a description of the error distribution and link function to be used in the model. This accepts only the following families: `"gaussian"` (normal errors model), `"binomial"` (logistic model), and `"poisson"` (Poisson model). See `family` for details of family functions.
`data`	an optional data frame, list or environment containing the variables in the model. The default is `"NULL"`.
`nloop`	length of the MCMC. The default is `10000`.
`burnin`	a positive value, smaller than `nloop`, that indicates the amount of initial MCMC values to be discarded. By default, it burns-in the first 10% chain values.

Details

We assume the additive model for each systematic component element \eta_i given by

\eta_i = f_1(x_{1i}) + ... + f_L(x_{Li}) + z_i'\gamma,

where z_i is a vector of variables to be modelled parametrically and \gamma is a parameter vector. The functions f_l of the continuous predictors x_l are assumed to be smooth, and shape restrictions such as monotonicity and/or convexity might be assumed. Generally, the vector \eta=(\eta_1, ..., \eta_n)' is approximated by

\sum_{j=1}^{m_1}\beta_{1j}\delta_{1j} + ... + \sum_{j=1}^{m_L}\beta_{Lj}\delta_{Lj} + \sum_{j=1}^{p}\alpha_j\nu_j,

where \beta_{lj} \ge 0 for all l,j. The \delta's represent the basis vectors used to approximate the f functions. The \nu_j consists of the one vector and the vectors of the observed values of covariates to be modelled parametrically. In addition, when f_l is assumed to be convex, the x_l vector is included as one of the \nu_j.

A Bayesian approach is considered for estimation and inference of the model above. As the \beta coefficients are constrained to be non-negative, then a gamma prior with hyperparameters c_{l1} (shape) and c_{l2} (scale) is assumed for each \beta_{lj}. The values c_{l1} and c_{l2} are chosen in a way that a large variance can be combined with a small mean, so that it is close to a non-informative gamma prior. Further, a normal prior distribution with mean zero and large variance M is considered for the \alpha coefficients.

bcgam makes use of the system "nimble" to set the Bayesian (hierarchical) model and compute the MCMC. Hence, "nimble" has to be loaded in R to be able to use bcgam. Information about how to download and install "nimble" can be found at https://r-nimble.org.

Value

bcgam returns an object of class "bcgam".

The generic routines summary and print are used to obtain and print a summary of the results. Further, 2D and 3D plots can be created using plot and persp, respectively.

An object of class "bcgam" is a list containing at least the following components:

`coefs`	a vector of posterior means of the `\alpha` and `\beta` coefficients.
`sd.coefs`	a vector of posterior standard errors of the `\alpha` and `\beta` coefficients.
`etahat`	a vector of posterior means of the systematic component `\eta`.
`muhat`	a vector of posterior means of `\mu`. `\mu` is obtained by transforming `\eta` using the inverse of the link function.
`alpha.sims`	a matrix of posterior samples (after burn-in) of the `\alpha` coefficients.
`beta.sims`	a matrix of posterior samples (after burn-in) of the `\beta` coefficients.
`sigma.sims`	a matrix of posterior samples (after burn-in) of `\sigma`. This is only shown when `family="gaussian"`.
`eta.sims`	a matrix of posterior samples (after burn-in) of the systematic component `\eta`.
`mu.sims`	a matrix of posterior samples (after burn-in) of `\mu`. `\mu` is obtained by transforming `\eta` using the inverse of the link function.
`delta`	a matrix that contains the basis functions `\delta` in its columns.
`zmat`	a matrix that contains the vectors `\nu` in its columns.
`knots`	a list of the knots.
`shapes`	a list of numbers that indicate the shape categories.
`sps`	a character vector of the space parameter used to create the knots.
`nloop`	the length of the MCMC.
`burnin`	the burn-in value.
`family`	the family parameter.
`y`	the response variable.

Author(s)

Cristian Oliva-Aviles and Mary C. Meyer

References

Meyer, M. C. (2008) Inference using shape-restricted regression splines. Annals of Applied Statistics 2(3), 1013-1033.

Meyer, M. C., Hackstadt, A. J., and Hoeting J. A. (2011) Bayesian estimation and inference for generalised partial linear models using shape-restricted splines. Journal of Nonparametric Statistics 23(4), 867-884.

Examples

## Not run: 
## Example 1 (gaussian)
data(duncan)

bcgam.fit <- bcgam(income~sm.incr(prestige, space="E")+sm.conv(education)+type, data=duncan)
print(bcgam.fit)
summary(bcgam.fit)
plot(bcgam.fit, prestige, col=4)
persp(bcgam.fit, prestige, education, level=0.90)


## Example 2 (poisson)
set.seed(2018)
n<-50
x1<-sqrt(1:n)
z<-as.factor(rbinom(n, 1, 0.5))
log.eta<-x1/7+0.2*as.numeric(z)+rnorm(50, sd=0.6)
eta<-exp(log.eta)
y<-rpois(n,eta)

bcgam.fit <- bcgam(y~sm.conv(x1)+z, family="poisson")
summary(bcgam.fit)
predict(bcgam.fit, newdata=data.frame(x1=0.2, z="0"), interval="credible")
plot(bcgam.fit, x1, col=3, col.inter=4)

## End(Not run)

[Package bcgam version 1.0 Index]