R: Double Generalized Linear Model

dglm-class {dglm}

R Documentation

Double Generalized Linear Model - class

Description

Class of objects returned by fitting double generalized linear models.

Details

Write \mu_i = \mbox{E}[y_i] for the expectation of the ith response. Then \mbox{Var}[y_i] = \phi_i V(\mu_i) where V is the variance function and \phi_i is the dispersion of the ith response (often denoted as the Greek character ‘phi’). We assume the link-linear models g(\mu_i) = \mathbf{x}_i^T \mathbf{b} and h(\phi_i) = \mathbf{z}_i^T \mathbf{a}, where \mathbf{x}_i and \mathbf{z}_i are vectors of covariates, and \mathbf{b} and \mathbf{a} are vectors of regression coefficients affecting the mean and dispersion respectively. The argument dlink specifies h. See family for how to specify g. The optional arguments mustart, betastart and phistart specify starting values for \mu_i, \mathbf{b} and \phi_i respectively.
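As an illustration of how the two link-linear models map onto a call, the following minimal sketch fits a gaussian mean model with a log-link dispersion model. The simulated data frame `dat` and its columns `x`, `z` and `y` are hypothetical, chosen only for this example; the sketch assumes the dglm package is installed.

```r
library(dglm)  # assumes the dglm package is installed

# Hypothetical simulated data: the mean depends on x, the dispersion on z
set.seed(42)
n <- 500
dat <- data.frame(x = rnorm(n), z = runif(n))
dat$y <- 1 + 2 * dat$x + rnorm(n, sd = sqrt(exp(-1 + 2 * dat$z)))

# formula gives the mean submodel g(mu_i) = x_i^T b (g comes from family);
# dformula gives the dispersion submodel h(phi_i) = z_i^T a (h comes from dlink)
out <- dglm(y ~ x, dformula = ~ z, family = gaussian, dlink = "log", data = dat)

coef(out)                 # estimates of b
coef(out$dispersion.fit)  # estimates of a
```

Here `coef(out)` corresponds to \mathbf{b} and `coef(out$dispersion.fit)` to \mathbf{a}.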

The parameters \mathbf{b} are estimated as for an ordinary glm. The parameters \mathbf{a} are estimated by way of a dual glm in which the deviance components of the ordinary glm appear as responses. The estimation procedure alternates between one iteration for the mean submodel and one iteration for the dispersion submodel until overall convergence.
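The alternating scheme can be sketched in base R for the gaussian case with a log dispersion link. This is an illustrative simplification, not the dglm implementation: each submodel is refitted to convergence at every outer step rather than advanced by a single iteration, and the simulated data, the variable names and the convergence tolerance are all assumptions made for the example.

```r
# Hypothetical simulated data: mean depends on x, log-dispersion on z
set.seed(1)
n <- 2000
x <- rnorm(n)
z <- rnorm(n)
phi_true <- exp(-1 + 1.5 * z)
y <- 1 + 2 * x + rnorm(n, sd = sqrt(phi_true))

phi <- rep(var(y), n)  # crude starting values for the dispersions
for (outer in 1:50) {
  # Mean submodel: ordinary glm with prior weights 1 / phi_i
  mean_fit <- glm(y ~ x, family = gaussian, weights = 1 / phi)

  # Deviance components of the gaussian mean model (unit deviances)
  d <- (y - fitted(mean_fit))^2

  # Dispersion submodel: gamma glm with the deviance components as responses
  disp_fit <- glm(d ~ z, family = Gamma(link = "log"))

  phi_new <- fitted(disp_fit)
  converged <- max(abs(log(phi_new) - log(phi))) < 1e-8
  phi <- phi_new
  if (converged) break
}

b_hat <- coef(mean_fit)  # mean coefficients, simulated with (1, 2)
a_hat <- coef(disp_fit)  # dispersion coefficients, simulated with (-1, 1.5)
```

The coefficient estimates of the gamma fit do not depend on its dispersion parameter, so the value 2 only matters for standard errors, for example via `summary(disp_fit, dispersion = 2)`.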

The output from dglm, out say, consists of two glm objects (that for the dispersion submodel is out$dispersion.fit) with a few more components for the outer iteration and overall likelihood. The summary and anova functions have special methods for dglm objects. Any generic function that has methods for glms or lms will work on out, giving information about the mean submodel. Information about the dispersion submodel can be obtained by using out$dispersion.fit as argument rather than out itself. In particular drop1(out,scale=1) gives correct score statistics for removing terms from the mean submodel, while drop1(out$dispersion.fit,scale=2) gives correct score statistics for removing terms from the dispersion submodel.
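A short sketch of how the two submodels might be inspected with ordinary glm tools. The simulated data frame `sim` is hypothetical, and the dglm package is assumed to be installed; passing `dispersion = 2` to `summary` follows the statement in this page that the dispersion submodel has dispersion parameter 2.

```r
library(dglm)  # assumes the dglm package is installed

# Hypothetical simulated data for illustration only
set.seed(7)
n <- 400
sim <- data.frame(x = rnorm(n), z = runif(n))
sim$y <- 1 + 2 * sim$x + rnorm(n, sd = sqrt(exp(-1 + 2 * sim$z)))
out <- dglm(y ~ x, dformula = ~ z, family = gaussian, data = sim)

# Generic accessors applied to `out` describe the mean submodel
coef(out)
head(fitted(out))

# The same accessors applied to out$dispersion.fit describe the dispersion submodel
disp <- out$dispersion.fit
coef(disp)
head(fitted(disp))             # fitted dispersions phi_i
summary(disp, dispersion = 2)  # dispersion of the dispersion submodel is 2
```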

The dispersion submodel is treated as a gamma family unless the original responses are gamma, in which case the dispersion submodel is digamma. This is exact if the original glm family is gaussian, Gamma or inverse.gaussian. In other cases it can be justified by the saddle-point approximation to the density of the responses. The results will therefore be close to exact ML or REML when the dispersions are small compared to the means. In all cases the dispersion submodel has prior weights 1, and has its own dispersion parameter which is 2.
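For the gaussian case the gamma form and the dispersion value 2 can be checked directly. The short derivation below treats \mu_i as known and writes d_i for the ith deviance component and V_d for the variance function of the dispersion submodel; both symbols are notation introduced here for illustration.

```latex
\begin{align*}
d_i &= (y_i - \mu_i)^2, \qquad \frac{d_i}{\phi_i} \sim \chi^2_1, \\
\mbox{E}[d_i] &= \phi_i, \qquad
\mbox{Var}[d_i] = 2\,\phi_i^2 = 2\,V_d(\phi_i), \qquad V_d(\phi) = \phi^2 .
\end{align*}
```

Since \phi_i \chi^2_1 is a gamma distribution with shape 1/2 and scale 2\phi_i, the deviance components are exactly gamma with mean \phi_i and dispersion 2.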

Generation

This class of objects is returned by the dglm function to represent a fitted double generalized linear model. Class "dglm" inherits from class "glm", since it consists of two coupled generalized linear models, one for the mean and one for the dispersion. Like glm, it also inherits from lm. The object returned has all the components of a glm object. The returned component object$dispersion.fit is also a glm object in its own right, representing the result of modelling the dispersion.

Methods

Objects of this class have methods for the functions print, plot, summary, anova, predict, fitted, drop1, add1, and step, amongst others. Specific methods (not shared with glm) exist for summary and anova.

Structure

A dglm object consists of a glm object with the following additional components:

`dispersion.fit`	the dispersion submodel: a `glm` object representing the fitted model for the dispersions. The responses for this model are the deviance components from the original generalized linear model. The prior weights are 1 and the dispersion or scale of this model is 2.
`iter`	the number of outer iterations used to fit the coupled mean-dispersion models. At each outer iteration, one IRLS iteration is performed for each of the mean and dispersion submodels.
`method`	fitting method used: `"ml"` if maximum likelihood was used or `"reml"` if adjusted profile likelihood was used.
`m2loglik`	minus twice the log-likelihood or adjusted profile likelihood of the fitted model.
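The additional components can be read off a fitted object directly. The following sketch fits the same hypothetical simulated data by both methods and prints the components side by side; it assumes the dglm package is installed and that the `method` argument of dglm accepts `"ml"` and `"reml"`, matching the values listed for the `method` component.

```r
library(dglm)  # assumes the dglm package is installed

# Hypothetical simulated data for illustration only
set.seed(11)
n <- 300
sim <- data.frame(x = rnorm(n), z = runif(n))
sim$y <- 1 + 2 * sim$x + rnorm(n, sd = sqrt(exp(-1 + 2 * sim$z)))

fit_ml   <- dglm(y ~ x, dformula = ~ z, family = gaussian, data = sim, method = "ml")
fit_reml <- dglm(y ~ x, dformula = ~ z, family = gaussian, data = sim, method = "reml")

# Print the components that a dglm object adds to a glm object
for (fit in list(fit_ml, fit_reml)) {
  cat("method:", fit$method,
      " outer iterations:", fit$iter,
      " m2loglik:", fit$m2loglik, "\n")
}

class(fit_ml)  # shows the classes the object inherits from
```

The two `m2loglik` values are not directly comparable, since one is minus twice a log-likelihood and the other minus twice an adjusted profile likelihood.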

Note

The anova method is questionable when applied to a dglm object with method="reml" (stick to method="ml").

Author(s)

Gordon Smyth, ported to R by Peter Dunn (pdunn2@usc.edu.au)

References

Smyth, G. K. (1989). Generalized linear models with varying dispersion. J. R. Statist. Soc. B, 51, 47–60. doi:10.1111/j.2517-6161.1989.tb01747.x

Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models. Environmetrics, 10, 695–709. doi:10.1002/(SICI)1099-095X(199911/12)10:6<695::AID-ENV385>3.0.CO;2-M https://gksmyth.github.io/pubs/Ties98-Preprint.pdf

Smyth, G. K., and Verbyla, A. P. (1999). Double generalized linear models: approximate REML and diagnostics. In Statistical Modelling: Proceedings of the 14th International Workshop on Statistical Modelling, Graz, Austria, July 19–23, 1999, H. Friedl, A. Berghold, G. Kauermann (eds.), Technical University, Graz, Austria, pages 66–80. https://gksmyth.github.io/pubs/iwsm99-Preprint.pdf