lm.disp {dispmod}R Documentation

Gaussian dispersion models

Description

This function estimates Gaussian dispersion regression models.

Usage

lm.disp(formula, var.formula, data = list(), maxit = 30, 
        epsilon = glm.control()$epsilon, subset, na.action = na.omit, 
        contrasts = NULL, offset = NULL)

Arguments

formula

a symbolic description of the mean function of the model to be fit. For the details of model formula specification see lm and formula.

var.formula

a symbolic description of the variance function of the model to be fit. This must be a one-sided formula; if omitted the same terms used for the mean function are used. For the details of model formula specification see lm and formula.

data

an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which the function is called.

maxit

integer giving the maximal number of iterations for the model fitting procedure.

epsilon

tolerance value for checking convergence. See glm.control.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NA's. By default is set to na.omit, but other possibilities are available; see na.omit.

contrasts

an optional list as described in the contrasts.arg argument of model.matrix.default.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. An offset term can be included in the formula instead or as well, and if both are specified their sum is used.

Details

Gaussian dispersion models allow to model variance heterogeneity in Gaussian regression analysis using a log-linear model for the variance.

Suppose a response y is modelled as a function of a set of p predictors x through the linear model

y_i = \beta'x_i + e_i

where e_i \sim N(0, \sigma^2) under homogeneity.

Variance heterogeneity is modelled as

V(e_i) = \sigma^2 = \exp(\lambda'z_i)

where z_i may contain some or all the variables in x_i and other variables not included in x_i; z_i is however assumed to contain a constant term.

The full model can be expressed as

E(y|x) = \beta'x

V(y|x) = \exp(\lambda'z)

and it is fitted by maximum likelihood following the algorithm described in Aitkin (1987).

Value

lm.dispmod() returns an object of class "dispmod".

The summary method can be used to obtain and print a summary of the results.

An object of class "dispmod" is a list containing the following components:

call

the matched call.

mean

an object of class "glm" giving the fitted model for the mean function; see glm

var

an object of class "glm" giving the fitted model for the variance function; see glm.

initial.deviance

the value of the deviance at the beginning of the iterative procedure, i.e. assuming constant variance.

deviance

the value of the deviance at the end of the iterative procedure.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)

References

Aitkin, M. (1987), Modelling variance heterogeneity in normal regression models using GLIM, Applied Statistics, 36, 332–339.

See Also

lm, glm, glm.binomial.disp, glm.poisson.disp, ncvTest.

Examples

data(minitab)
minitab <- within(minitab, y <- V^(1/3) )
mod <- lm(y ~ H + D, data = minitab)
summary(mod)

mod.disp1 <- lm.disp(y ~ H + D, data = minitab)
summary(mod.disp1)

mod.disp2 <- lm.disp(y ~ H + D, ~ H, data = minitab)
summary(mod.disp2)

# Likelihood ratio test
deviances <- c(mod.disp1$initial.deviance,
               mod.disp2$deviance, 
               mod.disp1$deviance)
lrt <- c(NA, abs(diff(deviances)))
cbind(deviances, lrt, p.value = 1-pchisq(lrt, 1))

# quadratic dispersion model on D (as discussed by Aitkin)
mod.disp4 <- lm.disp(y ~ H + D, ~ D + I(D^2), data = minitab)
summary(mod.disp4)

r <- mod$residuals
phi.est <- mod.disp4$var$fitted.values
plot(minitab$D, log(r^2))
lines(minitab$D, log(phi.est))

[Package dispmod version 1.2 Index]