bcgam {bcgam}R Documentation

Fitting Bayesian Constrained Generalised Additive Models

Description

bcgam is used to fit generalised partial linear regression models using a Bayesian approach, where shape and smoothness constraints are imposed on nonparametrically modelled predictors through shape-restricted splines, and no constraints are imposed on optional parametrically modelled covariates.

Usage

bcgam(formula, family = gaussian(), data = NULL, nloop = 10000,
  burnin = trunc(nloop/10))

Arguments

formula

an object of class formula that contains a symbolic description of the model to be fitted. It has the form "response~nonparam+param", where "nonparam" are the predictors to be modelled nonparametrically and "param" are the optional predictors to be modelled parametrically. The user has to specify the relationship between the systematic component \eta and any nonparametrically modelled predictor x. The options are:

  • sm.incr(x): x is smooth and increasing in \eta. See sm.incr for more details.

  • sm.decr(x): x is smooth and decreasing in \eta. See sm.decr for more details.

  • sm.conv(x): x is smooth and convex in \eta. See sm.conv for more details.

  • sm.conc(x): x is smooth and concave in \eta. See sm.conc for more details.

  • sm.incr.conv(x): x is smooth, increasing and convex in \eta. See sm.incr.conv for more details.

  • sm.decr.conv(x): x is smooth, decreasing and convex in \eta. See sm.decr.conv for more details.

  • sm.incr.conc(x): x is smooth, increasing and concave in \eta. See sm.incr.conc for more details.

  • sm.decr.conc(x): x is smooth, decreasing and concave in \eta. See sm.decr.conc for more details.

family

a description of the error distribution and link function to be used in the model. This accepts only the following families: "gaussian" (normal errors model), "binomial" (logistic model), and "poisson" (Poisson model). See family for details of family functions.

data

an optional data frame, list or environment containing the variables in the model. The default is "NULL".

nloop

length of the MCMC. The default is 10000.

burnin

a positive value, smaller than nloop, that indicates the amount of initial MCMC values to be discarded. By default, it burns-in the first 10% chain values.

Details

We assume the additive model for each systematic component element \eta_i given by

\eta_i = f_1(x_{1i}) + ... + f_L(x_{Li}) + z_i'\gamma,

where z_i is a vector of variables to be modelled parametrically and \gamma is a parameter vector. The functions f_l of the continuous predictors x_l are assumed to be smooth, and shape restrictions such as monotonicity and/or convexity might be assumed. Generally, the vector \eta=(\eta_1, ..., \eta_n)' is approximated by

\sum_{j=1}^{m_1}\beta_{1j}\delta_{1j} + ... + \sum_{j=1}^{m_L}\beta_{Lj}\delta_{Lj} + \sum_{j=1}^{p}\alpha_j\nu_j,

where \beta_{lj} \ge 0 for all l,j. The \delta's represent the basis vectors used to approximate the f functions. The \nu_j consists of the one vector and the vectors of the observed values of covariates to be modelled parametrically. In addition, when f_l is assumed to be convex, the x_l vector is included as one of the \nu_j.

A Bayesian approach is considered for estimation and inference of the model above. As the \beta coefficients are constrained to be non-negative, then a gamma prior with hyperparameters c_{l1} (shape) and c_{l2} (scale) is assumed for each \beta_{lj}. The values c_{l1} and c_{l2} are chosen in a way that a large variance can be combined with a small mean, so that it is close to a non-informative gamma prior. Further, a normal prior distribution with mean zero and large variance M is considered for the \alpha coefficients.

bcgam makes use of the system "nimble" to set the Bayesian (hierarchical) model and compute the MCMC. Hence, "nimble" has to be loaded in R to be able to use bcgam. Information about how to download and install "nimble" can be found at https://r-nimble.org.

Value

bcgam returns an object of class "bcgam".

The generic routines summary and print are used to obtain and print a summary of the results. Further, 2D and 3D plots can be created using plot and persp, respectively.

An object of class "bcgam" is a list containing at least the following components:

coefs

a vector of posterior means of the \alpha and \beta coefficients.

sd.coefs

a vector of posterior standard errors of the \alpha and \beta coefficients.

etahat

a vector of posterior means of the systematic component \eta.

muhat

a vector of posterior means of \mu. \mu is obtained by transforming \eta using the inverse of the link function.

alpha.sims

a matrix of posterior samples (after burn-in) of the \alpha coefficients.

beta.sims

a matrix of posterior samples (after burn-in) of the \beta coefficients.

sigma.sims

a matrix of posterior samples (after burn-in) of \sigma. This is only shown when family="gaussian".

eta.sims

a matrix of posterior samples (after burn-in) of the systematic component \eta.

mu.sims

a matrix of posterior samples (after burn-in) of \mu. \mu is obtained by transforming \eta using the inverse of the link function.

delta

a matrix that contains the basis functions \delta in its columns.

zmat

a matrix that contains the vectors \nu in its columns.

knots

a list of the knots.

shapes

a list of numbers that indicate the shape categories.

sps

a character vector of the space parameter used to create the knots.

nloop

the length of the MCMC.

burnin

the burn-in value.

family

the family parameter.

y

the response variable.

Author(s)

Cristian Oliva-Aviles and Mary C. Meyer

References

Meyer, M. C. (2008) Inference using shape-restricted regression splines. Annals of Applied Statistics 2(3), 1013-1033.

Meyer, M. C., Hackstadt, A. J., and Hoeting J. A. (2011) Bayesian estimation and inference for generalised partial linear models using shape-restricted splines. Journal of Nonparametric Statistics 23(4), 867-884.

See Also

predict.bcgam

Examples

## Not run: 
## Example 1 (gaussian)
data(duncan)

bcgam.fit <- bcgam(income~sm.incr(prestige, space="E")+sm.conv(education)+type, data=duncan)
print(bcgam.fit)
summary(bcgam.fit)
plot(bcgam.fit, prestige, col=4)
persp(bcgam.fit, prestige, education, level=0.90)


## Example 2 (poisson)
set.seed(2018)
n<-50
x1<-sqrt(1:n)
z<-as.factor(rbinom(n, 1, 0.5))
log.eta<-x1/7+0.2*as.numeric(z)+rnorm(50, sd=0.6)
eta<-exp(log.eta)
y<-rpois(n,eta)

bcgam.fit <- bcgam(y~sm.conv(x1)+z, family="poisson")
summary(bcgam.fit)
predict(bcgam.fit, newdata=data.frame(x1=0.2, z="0"), interval="credible")
plot(bcgam.fit, x1, col=3, col.inter=4)

## End(Not run)

[Package bcgam version 1.0 Index]