amlps {blapsr} | R Documentation |
Bayesian additive partial linear modeling with Laplace-P-splines.
Description
Fits an additive partial linear model to data using an approximate Bayesian inference technique based on penalized regression splines and Laplace approximations. Smooth additive terms are specified as a linear combination of of a large number of cubic B-splines. To counterbalance the roughness of the fit, a discrete penalty on neighboring spline coefficients is imposed in the spirit of Eilers and Marx (1996). The error of the model is assumed to be Gaussian with zero mean and finite variance.
The optimal amount of smoothing is determined by a grid-based exploration of the posterior penalty space when the number of smooth terms is small to moderate. When the dimension of the penalty space is large, the optimal smoothing parameter is chosen to be the value that maximizes the (log-)posterior of the penalty vector.
Usage
amlps(formula, data, K = 30, penorder = 2, cred.int = 0.95)
Arguments
formula |
A formula object where the ~ operator separates the response
from the covariates of the linear part |
data |
Optional. A data frame to match the variable names provided in formula. |
K |
A positive integer specifying the number of cubic B-spline
functions in the basis used to model the smooth terms. Default is
|
penorder |
The penalty order used on finite differences of the coefficients of contiguous B-splines. Can be either 2 for a second-order penalty (the default) or 3 for a third-order penalty. |
cred.int |
The level of the pointwise credible interval to be computed for the coefficients in the linear part of the model. |
Details
The B-spline basis used to approximate a smooth additive component
is computed with the function cubicbs
. The lower (upper)
bound of the B-spline basis is taken to be the minimum (maximum) value of
the covariate associated to the smooth. For identifiability
purposes, the B-spline matrices (computed over the observed covariates)
are centered. The centering consists is subtracting from each column of the
B-spline matrix, the corresponding column average of another B-spline matrix
computed on a fine grid of equidistant values in the domain of the smooth
term.
A hierarchical Gamma prior is imposed on the roughness penalty vector and Jeffreys' prior is imposed on the precision of the error. A Newton-Raphson algorithm is used to compute the posterior mode of the (log-)posterior penalty vector. The latter algorithm uses analytically derived versions of the gradient and Hessian. When the number of smooth terms in the model is smaller or equal to 4, a grid-based strategy is used for posterior exploration of the penalty space. Above that threshold, the optimal amount of smoothness is determined by the posterior maximum value of the penalty vector. This strategy allows to keep the computational burden to fit the model relatively low and to conserve good statistical performance.
Value
An object of class amlps
containing several components from
the fit. Details can be found in amlps.object
. Details on
the output printed by amlps
can be found in
print.amlps
. Fitted smooth terms can be visualized with the
plot.amlps
routine.
Author(s)
Oswaldo Gressani oswaldo_gressani@hotmail.fr.
References
Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2): 89-121.
Fan, Y. and Li, Q. (2003). A kernel-based method for estimating additive partially linear models. Statistica Sinica, 13(3): 739-762.
Gressani, O. and Lambert, P. (2018). Fast Bayesian inference using Laplace approximations in a flexible promotion time cure model based on P-splines. Computational Statistical & Data Analysis 124: 151-167.
Opsomer, J. D. and Ruppert, D. (1999). A root-n consistent backfitting estimator for semiparametric additive modeling. Journal of Computational and Graphical Statistics, 8(4): 715-732.
See Also
cubicbs
, amlps.object
,
print.amlps
, plot.amlps
Examples
### Classic simulated data example (with simgamdata)
set.seed(17)
sim.data <- simgamdata(setting = 2, n = 200, dist = "gaussian", scale = 0.4)
data <- sim.data$data # Simulated data frame
# Fit model
fit <- amlps(y ~ z1 + z2 + sm(x1) + sm(x2), data = data, K = 15)
fit