R: Gaussian dispersion models

lm.disp {dispmod}

R Documentation

Gaussian dispersion models

Description

This function estimates Gaussian dispersion regression models.

Usage

lm.disp(formula, var.formula, data = list(), maxit = 30, 
        epsilon = glm.control()$epsilon, subset, na.action = na.omit, 
        contrasts = NULL, offset = NULL)

Arguments

`formula`	a symbolic description of the mean function of the model to be fit. For the details of model formula specification see `lm` and `formula`.
`var.formula`	a symbolic description of the variance function of the model to be fit. This must be a one-sided formula; if omitted the same terms used for the mean function are used. For the details of model formula specification see `lm` and `formula`.
`data`	an optional data frame containing the variables in the model. By default the variables are taken from `environment(formula)`, typically the environment from which the function is called.
`maxit`	integer giving the maximal number of iterations for the model fitting procedure.
`epsilon`	tolerance value for checking convergence. See `glm.control`.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain `NA`'s. By default is set to `na.omit`, but other possibilities are available; see `na.omit`.
`contrasts`	an optional list as described in the `contrasts.arg` argument of `model.matrix.default`.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. An `offset` term can be included in the formula instead or as well, and if both are specified their sum is used.

Details

Gaussian dispersion models allow to model variance heterogeneity in Gaussian regression analysis using a log-linear model for the variance.

Suppose a response y is modelled as a function of a set of p predictors x through the linear model

y_i = \beta'x_i + e_i

where e_i \sim N(0, \sigma^2) under homogeneity.

Variance heterogeneity is modelled as

V(e_i) = \sigma^2 = \exp(\lambda'z_i)

where z_i may contain some or all the variables in x_i and other variables not included in x_i; z_i is however assumed to contain a constant term.

The full model can be expressed as

E(y|x) = \beta'x

V(y|x) = \exp(\lambda'z)

and it is fitted by maximum likelihood following the algorithm described in Aitkin (1987).

Value

lm.dispmod() returns an object of class "dispmod".

The summary method can be used to obtain and print a summary of the results.

An object of class "dispmod" is a list containing the following components:

`call`	the matched call.
`mean`	an object of class `"glm"` giving the fitted model for the mean function; see `glm`
`var`	an object of class `"glm"` giving the fitted model for the variance function; see `glm`.
`initial.deviance`	the value of the deviance at the beginning of the iterative procedure, i.e. assuming constant variance.
`deviance`	the value of the deviance at the end of the iterative procedure.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)

References

Aitkin, M. (1987), Modelling variance heterogeneity in normal regression models using GLIM, Applied Statistics, 36, 332–339.

Examples

data(minitab)
minitab <- within(minitab, y <- V^(1/3) )
mod <- lm(y ~ H + D, data = minitab)
summary(mod)

mod.disp1 <- lm.disp(y ~ H + D, data = minitab)
summary(mod.disp1)

mod.disp2 <- lm.disp(y ~ H + D, ~ H, data = minitab)
summary(mod.disp2)

# Likelihood ratio test
deviances <- c(mod.disp1$initial.deviance,
               mod.disp2$deviance, 
               mod.disp1$deviance)
lrt <- c(NA, abs(diff(deviances)))
cbind(deviances, lrt, p.value = 1-pchisq(lrt, 1))

# quadratic dispersion model on D (as discussed by Aitkin)
mod.disp4 <- lm.disp(y ~ H + D, ~ D + I(D^2), data = minitab)
summary(mod.disp4)

r <- mod$residuals
phi.est <- mod.disp4$var$fitted.values
plot(minitab$D, log(r^2))
lines(minitab$D, log(phi.est))

[Package dispmod version 1.2 Index]