R: Overdispersed binomial logit models

glm.binomial.disp {dispmod}

R Documentation

Overdispersed binomial logit models

Description

This function estimates overdispersed binomial logit models using the approach discussed by Williams (1982).

Usage

glm.binomial.disp(object, maxit = 30, verbose = TRUE)

Arguments

`object`	an object of class `"glm"` providing a fitted binomial logistic regression model; see `glm`.
`maxit`	integer giving the maximum number of iterations for the model fitting procedure.
`verbose`	logical; if `TRUE`, information is printed at each step of the algorithm.

Details

Extra-binomial variation in logistic linear models is discussed, among others, in Collett (1991). Williams (1982) proposed a quasi-likelihood approach for handling overdispersion in logistic regression models.

Suppose we observe the number of successes y_i in m_i trials, for i = 1, \ldots, n, such that

y_i \mid p_i \sim \mathrm{Binomial}(m_i, p_i)

p_i \sim \mathrm{Beta}(\gamma, \delta)

Under this model, each of the n binomial observations has a different probability of success p_i, where p_i is a random draw from a Beta distribution. Thus,

E(p_i) = \frac{\gamma}{\gamma+\delta} = \theta

V(p_i) = \phi\theta(1-\theta)

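Assuming \gamma > 1 and \delta > 1, the Beta density is zero at the extreme values of zero and one. Since V(p_i) = \theta(1-\theta)/(\gamma+\delta+1) for a Beta distribution, \phi = 1/(\gamma+\delta+1) and thus 0 < \phi < 1/3. From this, the unconditional mean and variance can be calculated: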

E(y_i) = m_i \theta

V(y_i) = m_i \theta (1-\theta)(1+(m_i-1)\phi)

so unless m_i = 1 or \phi = 0, the unconditional variance of y_i is larger than the binomial variance m_i \theta (1-\theta).
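The variance inflation can be checked numerically. The following is a minimal simulation sketch in base R, not part of the package; the parameter values and the names `gam`, `del`, `v_binom` and `v_model` are illustrative assumptions, and \phi is computed as 1/(\gamma+\delta+1), which follows from the variance of the Beta distribution.

```r
set.seed(1)

gam  <- 3      # Beta parameter gamma (illustrative value)
del  <- 6      # Beta parameter delta (illustrative value)
m    <- 20     # number of trials per unit
nsim <- 1e5    # number of simulated units

theta <- gam / (gam + del)        # E(p_i)
phi   <- 1 / (gam + del + 1)      # implied dispersion parameter

# Draw p_i from the Beta distribution, then y_i | p_i from the binomial.
p <- rbeta(nsim, gam, del)
y <- rbinom(nsim, size = m, prob = p)

v_binom <- m * theta * (1 - theta)           # plain binomial variance
v_model <- v_binom * (1 + (m - 1) * phi)     # variance under the mixing model

# The simulated variance should be close to v_model and well above v_binom.
round(c(simulated = var(y), model = v_model, binomial = v_binom), 2)
```

With m = 1 the inflation factor 1 + (m - 1)\phi equals one, so the same simulation would reproduce the ordinary Bernoulli variance.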

Identical expressions for the mean and variance of y_i can be obtained if we assume that the m_i binary responses on the i-th unit are dependent, each pair having the same correlation \phi. In this case, -1/(m_i - 1) < \phi \le 1.
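One way to see the link between the two formulations is that, under the Beta mixing model, any two trials on the same unit share the same p_i and therefore have correlation \phi. The sketch below illustrates this by simulation in base R; the parameter values and variable names are illustrative assumptions, not part of the package.

```r
set.seed(2)

gam  <- 3                       # Beta parameter gamma (illustrative value)
del  <- 6                       # Beta parameter delta (illustrative value)
phi  <- 1 / (gam + del + 1)     # dispersion implied by the Beta distribution
nsim <- 2e5                     # number of simulated units

# Each unit gets its own success probability ...
p <- rbeta(nsim, gam, del)

# ... and two Bernoulli trials that share it.
z1 <- rbinom(nsim, size = 1, prob = p)
z2 <- rbinom(nsim, size = 1, prob = p)

# The sample correlation between the two trials should be close to phi.
r <- cor(z1, z2)
c(correlation = r, phi = phi)
```

The correlated-trials formulation is more general, since it also admits negative values of \phi, which the Beta mixing model cannot produce.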

The method proposed by Williams uses an iterative algorithm for estimating the dispersion parameter \phi and hence the necessary weights 1/(1 + \phi(m_i - 1)) (for details see Williams, 1982).
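The iteration can be sketched as follows. This is a simplified moment-matching loop written for illustration, not the source code of `glm.binomial.disp`; the function `williams_sketch`, its arguments, and the simulated data are all hypothetical. It assumes that the expected weighted Pearson statistic is approximately \sum w_i (1 - h_i)(1 + \phi(m_i - 1)), where h_i are the leverages of the weighted fit, and solves that equation for \phi at each step.

```r
# Hypothetical helper: a Williams-type iteration, NOT the dispmod implementation.
williams_sketch <- function(formula, data, m, maxit = 30, tol = 1e-6) {
  phi <- 0
  for (iter in seq_len(maxit)) {
    # Weights implied by the current dispersion estimate.
    data$.w <- 1 / (1 + phi * (m - 1))
    fit <- glm(formula, family = binomial(logit), data = data, weights = .w)

    w  <- data$.w
    h  <- hatvalues(fit)                       # leverages of the weighted fit
    X2 <- sum(residuals(fit, "pearson")^2)     # weighted Pearson statistic

    # Solve X2 = sum(w * (1 - h) * (1 + phi * (m - 1))) for phi,
    # holding w and h fixed at their current values.
    phi_new <- (X2 - sum(w * (1 - h))) / sum(w * (m - 1) * (1 - h))
    phi_new <- max(phi_new, 0)                 # truncate at zero (no overdispersion)

    converged <- abs(phi_new - phi) < tol
    phi <- phi_new
    if (converged) break
  }
  list(dispersion = phi, disp.weights = w, fit = fit, iterations = iter)
}

# Simulated overdispersed data (illustrative values only).
set.seed(3)
n   <- 40
sim <- data.frame(x = rnorm(n), m = sample(10:50, n, replace = TRUE))
theta    <- plogis(-0.5 + 0.8 * sim$x)
phi_true <- 0.1
a <- 1 / phi_true - 1                          # gamma + delta for this phi
p <- rbeta(n, theta * a, (1 - theta) * a)      # unit-level probabilities
sim$y <- rbinom(n, size = sim$m, prob = p)

res <- williams_sketch(cbind(y, m - y) ~ x, data = sim, m = sim$m)
res$dispersion
```

The weights are stored as a column of `data` so that `glm` can find them regardless of where the formula was created. When all m_i are equal the weights are constant and the update reaches its fixed point after the first step.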

Value

The function returns an object of class `"glm"` with the usual components plus the following:

`dispersion`	the estimated dispersion parameter.
`disp.weights`	the final weights used to fit the model.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc).

References

Collett, D. (1991), Modelling Binary Data, London: Chapman and Hall.

Williams, D. A. (1982), Extra-binomial variation in logistic linear models, Applied Statistics, 31, 144–148.

Examples

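data(orobanche)

# Standard binomial logit fit
mod <- glm(cbind(germinated, seeds - germinated) ~ host * variety, data = orobanche,
           family = binomial(logit))
summary(mod)

# Refit allowing for overdispersion (Williams, 1982)
mod.disp <- glm.binomial.disp(mod)
summary(mod.disp)
mod.disp$dispersion  # estimated dispersion parameter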

[Package dispmod version 1.2 Index]