R: Overdispersed Poisson log-linear models

glm.poisson.disp {dispmod}

R Documentation

Overdispersed Poisson log-linear models

Description

This function estimates overdispersed Poisson log-linear models using the approach discussed by Breslow N.E. (1984).

Usage

glm.poisson.disp(object, maxit = 30, verbose = TRUE)

Arguments

`object`	an object of class `"glm"` providing a fitted Poisson log-linear regression model; see `glm`.
`maxit`	integer giving the maximal number of iterations for the model fitting procedure.
`verbose`	logical, if `TRUE` information are printed during each step of the algorithm.

Details

Breslow (1984) proposed an iterative algorithm for fitting overdispersed Poisson log-linear models. The method is similar to that proposed by Williams (1982) for handling overdispersion in logistic regression models (see glm.binomial.disp).

Suppose we observe n independent responses such that

y_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_in_i)

for i = 1, \ldots, n. The response variable y_i may be an event counts variable observed over a period of time (or in the space) of length n_i, whereas \lambda_i is the rate parameter. Then,

E(y_i \mid \lambda_i) = \mu_i = \lambda_i n_i = \exp(\log(n_i) + \log(\lambda_i))

where \log(n_i) is an offset and \log(\lambda_i) = \beta'x_i expresses the dependence of the Poisson rate parameter on a set of, say p, predictors. If the periods of time are all of the same length, we can set n_i = 1 for all i so the offset is zero.

The Poisson distribution has E(y_i \mid \lambda_i) = V(y_i \mid \lambda_i), but it may happen that the actual variance exceeds the nominal variance under the assumed probability model.

Suppose that \theta_i = \lambda_i n_i is a random variable distributed according to

\theta_i \sim \mathrm{Gamma} (\mu_i, 1/\phi)

where E(\theta_i) = \mu_i and V(\theta_i) = \mu_i^2 \phi. Thus, it can be shown that the unconditional mean and variance of y_i are given by

E(y_i) = \mu_i

and

V(y_i) = \mu_i + \mu_i^2 \phi = \mu_i (1 + \mu_i\phi)

Hence, for \phi > 0 we have overdispersion. It is interesting to note that the same mean and variance arise also if we assume a negative binomial distribution for the response variable.

The method proposed by Breslow uses an iterative algorithm for estimating the dispersion parameter \phi and hence the necessary weights 1/(1 + \mu_i \hat{\phi}) (for details see Breslow, 1984).

Value

The function returns an object of class "glm" with the usual information and the added components:

`dispersion`	the estimated dispersion parameter.
`disp.weights`	the final weights used to fit the model.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)

References

Breslow, N.E. (1984), Extra-Poisson variation in log-linear models, Applied Statistics, 33, 38–44.

Examples

## Salmonella TA98 data
data(salmonellaTA98)
salmonellaTA98 <- within(salmonellaTA98, logx10 <- log(x+10))
mod <- glm(y ~ logx10 + x, data = salmonellaTA98, family = poisson(log)) 
summary(mod)

mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion

# compute predictions on a grid of x-values...
x0 <- with(salmonellaTA98, seq(min(x), max(x), length=50))
eta0 <- predict(mod, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE)
eta0.disp <- predict(mod.disp, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE)
# ... and plot the mean functions with variability bands
plot(y ~ x, data = salmonellaTA98)
lines(x0, exp(eta0$fit))
lines(x0, exp(eta0$fit+2*eta0$se), lty=2)
lines(x0, exp(eta0$fit-2*eta0$se), lty=2)
lines(x0, exp(eta0.disp$fit), col=3)
lines(x0, exp(eta0.disp$fit+2*eta0.disp$se), lty=2, col=3)
lines(x0, exp(eta0.disp$fit-2*eta0.disp$se), lty=2, col=3)

## Holford's data
data(holford)

mod <- glm(incid ~ offset(log(pop)) + Age + Cohort, data = holford, 
           family = poisson(log)) 
summary(mod)

mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion

[Package dispmod version 1.2 Index]