mgcv.package {mgcv} | R Documentation

## Mixed GAM Computation Vehicle with GCV/AIC/REML/NCV smoothness estimation and GAMMs by REML/PQL

### Description

`mgcv` provides functions for generalized additive modelling (`gam` and `bam`) and
generalized additive mixed modelling (`gamm` and `random.effects`). The term GAM is taken to include
any model dependent on unknown smooth functions of predictors and estimated by quadratically penalized (possibly quasi-) likelihood maximization. Available distributions are covered in `family.mgcv` and available smooths in `smooth.terms`.

Particular features of the package are facilities for automatic smoothness selection (Wood, 2004, 2011),
and the provision of a variety of smooths of more than one variable. User defined
smooths can be added. A Bayesian approach to confidence/credible interval calculation is
provided. Linear functionals of smooths, penalization of parametric model terms and linkage
of smoothing parameters are all supported. Lower level routines for generalized ridge
regression and penalized linearly constrained least squares are also available. In addition to the main modelling functions, `jagam` provides facilities to ease the set up of models for use with JAGS, while `ginla` provides marginal inference via a version of Integrated Nested Laplace Approximation.

### Details

`mgcv` provides generalized additive modelling functions `gam`, `predict.gam` and `plot.gam`, which are very similar
in use to the S functions of the same name designed by Trevor Hastie (with some extensions).
However, the underlying representation and estimation of the models is based on a
penalized regression spline approach, with automatic smoothness selection. A
number of other functions such as `summary.gam` and `anova.gam` are also provided, for extracting information from a fitted `gamObject`.

Use of `gam` is much like use of `glm`, except that
within a `gam` model formula, isotropic smooths of any number of predictors can be specified using
`s` terms, while scale invariant smooths of any number of
predictors can be specified using `te`, `ti` or `t2` terms.
`smooth.terms` provides an
overview of the built in smooth classes, and `random.effects` should be referred to for an overview
of random effects terms (see also `mrf` for Markov random fields). Estimation is by
penalized likelihood or quasi-likelihood maximization, with smoothness
selection by GCV, GACV, gAIC/UBRE, `NCV` or (RE)ML. See `gam`, `gam.models`,
`linear.functional.terms` and `gam.selection` for some discussion of model specification and
selection. For detailed control of fitting see `gam.convergence`,
`gam` arguments `method` and `optimizer`, and `gam.control`. For checking and
visualization see `gam.check`, `choose.k`, `vis.gam` and `plot.gam`.
While a number of types of smoother are built into the package, it is also
extendable with user defined smooths; see `smooth.construct`, for example.
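As a minimal sketch of the formula interface described above, the following fits a model with isotropic `s` terms and a scale invariant `te` term, with REML smoothness selection. The simulated data from `gamSim`, the seed, the sample size and the particular choice of terms are illustrative assumptions only, not recommendations.

```r
library(mgcv)

set.seed(1)
## Simulated data with covariates x0..x3 and a Gaussian response y
## (illustrative only; any data frame with suitable columns would do).
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)

## s() gives isotropic smooths; te() gives a scale invariant tensor
## product smooth of x1 and x2. Smoothness is selected by REML here.
b <- gam(y ~ s(x0) + te(x1, x2) + s(x3), data = dat, method = "REML")

summary(b)  # approximate significance of each smooth term
b$sp        # estimated smoothing parameters
```

`plot(b, pages = 1)` and `gam.check(b)` can then be used for visualization and basic checking of the fit.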

A Bayesian approach to smooth modelling is used to derive standard errors on
predictions, and hence credible intervals (see Marra and Wood, 2012). The Bayesian covariance matrix for
the model coefficients is returned in `Vp` of the `gamObject`. See `predict.gam` for examples of how
this can be used to obtain credible regions for any quantity derived from the
fitted model, either directly, or by direct simulation from the posterior
distribution of the model coefficients. Approximate p-values can also be obtained for testing
individual smooth terms for equality to the zero function, using similar ideas (see Wood, 2013a,b). Frequentist
approximations can be used for hypothesis testing based model comparison. See `anova.gam` and
`summary.gam` for more on hypothesis testing.
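The two routes to credible intervals mentioned above can be sketched roughly as follows. This assumes simulated `gamSim` data and a hypothetical prediction grid `newd`; the derived quantity (the location of the maximum of the linear predictor along `x1`) is chosen purely for illustration, and the approximate 95% level uses the common plus or minus two standard errors rule.

```r
library(mgcv)

set.seed(2)
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

## Hypothetical prediction grid: vary x1, hold the other covariates fixed.
newd <- data.frame(x0 = 0.5, x1 = seq(0, 1, length.out = 50),
                   x2 = 0.5, x3 = 0.5)

## Direct route: pointwise intervals from Bayesian standard errors.
p <- predict(b, newd, se.fit = TRUE)
fit <- as.numeric(p$fit)
se <- as.numeric(p$se.fit)
ci <- cbind(lower = fit - 2 * se, upper = fit + 2 * se)

## Simulation route: draw coefficients from the approximate posterior
## N(coef(b), Vp) and push each draw through the prediction matrix.
Xp <- predict(b, newd, type = "lpmatrix")  # maps coefficients to predictions
br <- rmvn(1000, coef(b), b$Vp)            # one posterior draw per row
sims <- Xp %*% t(br)                       # one simulated curve per column

## Interval for a nonlinear derived quantity: where the curve peaks in x1.
argmax_x1 <- newd$x1[apply(sims, 2, which.max)]
quantile(argmax_x1, c(0.025, 0.975))
```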

For large datasets (that is, large n) see `bam`, which is a version of `gam` with
a much reduced memory footprint. `bam(...,discrete=TRUE)` offers the very efficient methods of Wood et al. (2017) and Li and Wood (2020).
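A call of the form `bam(...,discrete=TRUE)` might look like the sketch below. The simulated data and the sample size are assumptions for illustration; real use cases for `bam` typically involve far more rows.

```r
library(mgcv)

set.seed(3)
## Moderately large simulated data set (illustrative size only).
dat <- gamSim(1, n = 20000, scale = 2, verbose = FALSE)

## discrete = TRUE requests the discretized covariate methods;
## bam's default smoothness selection method (fast REML) is used.
bd <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, discrete = TRUE)

summary(bd)
```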

The package also provides a generalized additive mixed modelling function,
`gamm`, based on a PQL approach and
`lme` from the `nlme` library (for an `lme4` based version, see package `gamm4`).
`gamm` is particularly useful
for modelling correlated data (i.e. where a simple independence model for the
residual variation is inappropriate). In addition, low level routine `magic` can fit models to data with a known correlation structure.
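To illustrate the correlated data case, the sketch below fits a smooth with AR(1) residuals using `gamm` and the `corAR1` correlation structure from `nlme`. The simulated series, the AR coefficient of 0.6 and the variable names are all hypothetical choices made for the example.

```r
library(mgcv)  # attaches nlme, which supplies corAR1

set.seed(4)
n <- 200
x <- seq(0, 1, length.out = n)

## Smooth signal plus AR(1) noise, so independence of residuals fails.
e <- 0.3 * as.numeric(arima.sim(list(ar = 0.6), n = n))
dat <- data.frame(y = sin(2 * pi * x) + e, x = x, time = seq_len(n))

## Residual correlation is modelled as AR(1) in observation order.
m <- gamm(y ~ s(x), data = dat, correlation = corAR1(form = ~ time))

summary(m$gam)  # the smooth, summarized as for a gam fit
m$lme           # the underlying lme fit, including the estimated AR parameter
```

`gamm` returns a list with a `gam` component and an `lme` component, so the usual `gam` methods apply to `m$gam` while correlation and variance parameters are read from `m$lme`.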

Some underlying GAM fitting methods are available as low level fitting
functions: see `magic`. But there is little functionality
that cannot be more conveniently accessed via `gam`.
Penalized weighted least squares with linear equality and inequality constraints is provided by
`pcls`.

For a complete list of functions type `library(help=mgcv)`. See also `mgcv.FAQ`.

### Author(s)

Simon Wood <simon.wood@r-project.org>

with contributions and/or help from Natalya Pya, Thomas Kneib, Kurt Hornik, Mike Lonergan, Henric Nilsson, Fabian Scheipl and Brian Ripley.

Polish translation - Lukasz Daniel; German translation - Chris Leick, Detlef Steuer; French Translation - Philippe Grosjean

Maintainer: Simon Wood <simon.wood@r-project.org>

Part funded by EPSRC: EP/K005251/1

### References

These provide details for the underlying mgcv methods, and fuller references to the large literature on which the methods are based.

Wood, S. N. (2020) Inference and computation with generalized additive models and their extensions. Test 29(2): 307-339. doi:10.1007/s11749-020-00711-5

Wood, S.N., N. Pya and B. Saefken (2016) Smoothing parameter and model selection for general smooth models (with discussion). Journal of the American Statistical Association 111:1548-1575 doi:10.1080/01621459.2016.1180986

Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36

Wood, S.N. (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Statist. Ass. 99:673-686.

Marra, G and S.N. Wood (2012) Coverage Properties of Confidence Intervals for Generalized Additive Model Components. Scandinavian Journal of Statistics, 39(1), 53-74.

Wood, S.N. (2013a) A simple test for random effects in regression models. Biometrika 100:1005-1010 doi:10.1093/biomet/ast038

Wood, S.N. (2013b) On p-values for smooth components of an extended generalized additive model. Biometrika 100:221-228 doi:10.1093/biomet/ass048

Wood, S.N. (2017) *Generalized Additive Models: an introduction with R (2nd edition)*, CRC doi:10.1201/9781315370279

Wood, S.N., Li, Z., Shaddick, G. & Augustin N.H. (2017) Generalized additive models for gigadata: modelling the UK black smoke network daily data. Journal of the American Statistical Association. 112(519):1199-1210 doi:10.1080/01621459.2016.1195744

Li, Z & S.N. Wood (2020) Faster model matrix crossproducts for large generalized linear models with discretized covariates. Statistics and Computing. 30:19-25 doi:10.1007/s11222-019-09864-2

Development of mgcv version 1.8 was part funded by EPSRC grants EP/K005251/1 and EP/I000917/1.

### Examples

```
## see examples for gam, bam and gamm
```

[Package *mgcv* version 1.9-1]