gamlps {blapsr} | R Documentation |
Bayesian generalized additive modeling with Laplace-P-splines.
Description
Fits a generalized additive model (GAM) to data using an approximate Bayesian inference technique based on penalized regression splines and Laplace approximations. Smooth additive terms are specified as a linear combination of a large number of cubic B-splines. To counterbalance the roughness of the fit, a discrete penalty on neighboring spline coefficients is imposed in the spirit of Eilers and Marx (1996). The effective degrees of freedom of the smooth terms are also estimated.
The optimal amount of smoothing is determined by a grid-based exploration of the posterior penalty space when the number of smooth terms is small to moderate. When the dimension of the penalty space is large, the optimal smoothing parameter is chosen to be the value that maximizes the (log-)posterior of the penalty vector. Approximate Bayesian credible intervals for latent model variables and functions of latent model variables are available.
Usage
gamlps(formula, data, K = 30, family = c("gaussian", "poisson", "bernoulli", "binomial"),
gauss.scale, nbinom, penorder = 2, cred.int = 0.95)
Arguments
formula |
A formula object where the ~ operator separates the response
from the covariates of the linear part |
data |
Optional. A data frame to match the variables names provided in formula. |
K |
A positive integer specifying the number of cubic B-spline
functions in the basis used to model the smooth terms. Default is
|
family |
The error distribution of the model. It is a character string
that must partially match either |
gauss.scale |
The scale parameter to be specified when
|
nbinom |
The number of experiments in the Binomial family. |
penorder |
The penalty order used on finite differences of the coefficients of contiguous B-splines. Can be either 2 for a second-order penalty (the default) or 3 for a third-order penalty. |
cred.int |
The level of the pointwise credible interval to be computed for the coefficients in the linear part of the model. |
Details
The B-spline basis used to approximate a smooth additive component
is computed with the function cubicbs
. The lower (upper)
bound of the B-spline basis is taken to be the minimum (maximum) value of
the covariate associated to the smooth. For identifiability
purposes, the B-spline matrices (computed over the observed covariates)
are centered. The centering consists is subtracting from each column of the
B-spline matrix, the corresponding column average of another B-spline matrix
computed on a fine grid of equidistant values in the domain of the smooth
term.
A hierarchical Gamma prior is imposed on the roughness penalty vector. A Newton-Raphson algorithm is used to compute the posterior mode of the (log-)posterior penalty vector. The latter algorithm uses analytically derived versions of the gradient and Hessian. When the number of smooth terms in the model is smaller or equal to 4, a grid-based strategy is used for posterior exploration of the penalty space. Above that threshold, the optimal amount of smoothness is determined by the posterior maximum value of the penalty vector. This strategy allows to keep the computational burden to fit the model relatively low and conserve good statistical performance.
Value
An object of class gamlps
containing several components from
the fit. Details can be found in gamlps.object
. Details on
the output printed by gamlps
can be found in
print.gamlps
. Fitted smooth terms can be visualized with the
plot.gamlps
routine.
Author(s)
Oswaldo Gressani oswaldo_gressani@hotmail.fr.
References
Hastie, T. J. and Tibshirani., RJ (1990). Generalized additive models. Monographs on statistics and applied probability, 43, 335.
Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2): 89-121.
Gressani, O. and Lambert, P. (2018). Fast Bayesian inference using Laplace approximations in a flexible promotion time cure model based on P-splines. Computational Statistical & Data Analysis 124: 151-167.
See Also
cubicbs
, gamlps.object
,
print.gamlps
, plot.gamlps
Examples
set.seed(14)
sim <- simgamdata(n = 300, setting = 2, dist = "binomial", scale = 0.25)
dat <- sim$data
fit <- gamlps(y ~ z1 + z2 + sm(x1) + sm(x2), K = 15, data = dat,
penorder = 2, family = "binomial", nbinom = 15)
fit
# Check fit and compare with target (in red)
opar <- par(no.readonly = TRUE)
par(mfrow = c(1, 2))
domx <- seq(-1, 1 ,length = 200)
plot(fit, smoo.index = 1, cred.int = 0.95, ylim = c(-2, 2))
lines(domx, sim$f[[1]](domx), type= "l", lty = 2, lwd = 2, col = "red")
plot(fit, smoo.index = 2, cred.int = 0.95, ylim = c(-3, 3))
lines(domx, sim$f[[2]](domx), type= "l", lty = 2, lwd = 2, col = "red")
par(opar)