R: The Extended Quasi-Likelihood Function

eql {EQL}

R Documentation

The Extended Quasi-Likelihood Function

Description

Computes the Extended Quasi Likelihood (EQL) function for a given set of variance functions from a particular variance family.

Usage

eql(formula, param.space, family = powerVarianceFamily(),
    phi.method = c("pearson", "mean.dev"), include.model = TRUE,
    smooth.grid = 10, do.smooth = dim(family) == 1,
    verbose = 1, ...)

## S3 method for class 'eql'
plot(x, do.points = (dim(x) == 1 && sum(!x$is.smoothed) <= 20),
     do.ci = TRUE, alpha = 0.95, do.bw = TRUE,
     show.max = TRUE, ...)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be used to determine the parameters of the variance function.
`param.space`	a list of parameters for which the EQL value should be evaluated. If provided as a named list, the names must equal the names of the parameters defined by the variance family.
`family`	an object of class `varianceFamily` giving a parameterized family of variance functions. See `varianceFamily` for further details.
`phi.method`	a character string giving the name of the method used to estimate the dispersion parameter `\phi`. Must be one of (“pearson”, “mean.dev”) representing the estimation of `\phi` by the mean Pearson's statistic or by the mean deviance, respectively.
`include.model`	logical. If `TRUE` (the default) the final model is included in the output.
`x`	an object of class `eql`.
`do.smooth`, `smooth.grid`	`do.smooth` is a logical value and `smooth.grid` is an integer value giving the number of nodes for the smoothing process. If `do.smooth` is `TRUE`, smoothing is carried out by cubic splines on an equidistant grid with an amount of nodes equals to `smooth.grid` between two adjacent EQL values. Smoothing is currently only available for one-dimensional variance families, i.e. families that depend only on one parameter.
`verbose`	the amount of feedback requested: ‘0’ or `FALSE` means no feedback, ‘1’ or `TRUE` means some feedback (the default), and ‘2’ means to show all available feedback. For the default setting, a progress bar will be displayed to give a rough estimation of the remaining calculation time. Full feedback prints the EQL value for each parameter combination.
`...`	further arguments to be passed to the `glm` routine and the plotting routine, respectively.
`do.points`, `show.max`	logical. If `do.points` is `TRUE`, the computed EQL values are marked in the plot. If `show.max` is `TRUE`, the maximum of the EQL function is emphasized in the plot.
`do.ci`, `alpha`	`do.ci` is a logical value, if `TRUE` a `\alpha` confidence interval (respectively confidence ellipsoid) is added to the plot.
`do.bw`	logical. If `TRUE` (the default) a “black and white” plot is produced, otherwise colours are used.

Details

The EQL function as defined by Nelder and Pregibon (see ‘References’) is given by:

Q^+_\theta(y,\mu) = -\frac{1}{2} \log [2\pi\phi V_\theta(y)] - \frac{1}{2\phi} D_\theta(y,\mu),

where D_\theta() and V_\theta() denote the deviance function and the variance function, respectively, determined by the particular choice of the variance family.

The goal is to maximize the EQL function over \mu and the not necessarily one-dimensional space of parameters \theta. The function eql takes a particular finite set of candidate parameters and computes the corresponding EQL value for each of these parameters and returns the maximum EQL value for the given set. That implies that the function is only capable of capturing local maxima. If the maximum occurs at the boundary of the set, the set of parameters may be badly chosen and one should consider a larger set with the found maximum as an interior point.

The plot function is an important tool to investigate the structure of the EQL function. Confidence intervals and confidence ellipsoids give an idea of plausible parameter values for the variance function. The contour plot used for two-dimensional variance families is generated using the package lattice, which in turn relies on so called trellis plots. Hence, for two-dimensional families the plot function does not only generate the plot, but also returns the plot object to allow for further modifications of the plot. This is not true for one-dimensional variance models, which are plotted using the R standard graphical engine.

For large parameter sets the computation may take a long time. If no feedback is chosen, the function seems to be hung up, because the function does not provide any textual feedback while computing. Hence, a minimal feedback (including a progress bar) should be chosen to have an idea of the remaining calculation time.

An explicitely given deviance function speeds up calculation. A rather large amount of the total calculation time is used to determine the numerical values of the integral in the deviance function.

Value

eql returns an object of class eql, which contains the following components:

`eql`	a numerical vector with the computed eql values for the given set of parameter values. For one-dimensional variance families (i.e. those families with only one parameter), a smoothing operation can be performed to obtain intermediate values.
`param`	a `data.frame` containing the values of the parameters at which the eql function was evaluated.
`eql.max`	the maximum value of the eql function in the considered range.
`param.max`	a `data.frame` containing the values of the parameters at which the maximum is obtained.
`dim`	an integer value giving the dimension of the parameters in the underlying variance family.
`smooth`	a logical value indicating whether a smoothing operation was performed.
`is.smoothed`	a vector of logical values of the same length as `eql` indicating if the particular EQL value was obtained by smoothing or was calculated directly.
`smooth.grid`	an integer value giving the number of points used in the smoothing process or `NULL` if no smoothing was performed.
`model`	if `include.model` is `TRUE`, the GLM for which the maximum EQL value was archieved, `NULL` otherwise.

Note

The EQL for variance functions with V_\theta(0)=0 becomes infinite. Hence, if there are exact zeros in the data, one should provide a variance family, which do not equate to zero at the origin. Nelder and Pregibon propose some adjustment of V(y) at the origin, which leads to a modified variance function.

The predefined families powerVarianceFamily and extBinomialVarianceFamily are, however, not capable of dealing with exact zeros, for there is no general mechanism to modify the variance function for all possible values of the particular variance family.

The confidence interval for one-dimensional variance families is not calculated exactly, but depends on the amount of EQL values available. Hence, if one is interested in a confidence interval, one should allow for smoothing.

The function eql does not use a direct maximization routine, but rather do a simple maximation over a finite set. Hence, all obtained values including confidence intervals and confidence ellipsoids have a “local flavour” and should not be regarded as global solutions.

The confidence bounds are determined rather empirically and do heavily depend on the amount of parameter values under consideration.

Author(s)

Thorn Thaler

References

Nelder, J.A. and Pregibon, D. (1987). An extended quasi-likelihood function. Biometrika, 74, 221–232.

Examples

## Power Variance Family
# Data from Box and Cox (1964)
x <- (-1:1)
y <- c(674,370,292,338,266,210,170,118,90,1414,1198,634,1022,620,438,
       442,332,220,3636,3184,2000,1568,1070,566,1140,884,360)
yarn.raw <- data.frame(expand.grid(x3=x, x2=x, x1=x), cycles=y)
yarn <- data.frame(x1=yarn.raw$x1, x2=yarn.raw$x2, x3=yarn.raw$x3,
   cycles=yarn.raw$cycles)
attach(yarn)

ps.power <- list(theta=seq(1, 4, length = 20))
eq.power <- eql(cycles~x1+x2+x3, param.space=ps.power,
   family=powerVarianceFamily("log"), smooth.grid=500)
plot(eq.power)

## Not run: 
## Extended Binomial Variance Family
# Data from McCullagh & Nelder: GLM, p. 329
# (zeros replaced by 'NA')

site <- rep(1:9, each=10)
variety <- rep(1:10, 9)
resp <- c(0.05,NA,NA,0.10,0.25,0.05,0.50,1.30,1.50,1.50,
   NA,0.05,0.05,0.30,0.75,0.30,3,7.50,1,12.70,1.25,1.25,
   2.50,16.60,2.50,2.50,NA,20,37.50,26.25,2.50,0.50,0.01,
   3,2.50,0.01,25,55,5,40,5.50,1,6,1.10,2.50,8,16.50,
   29.50,20,43.50,1,5,5,5,5,5,10,5,50,75,5,0.10,5,5,
   50,10,50,25,50,75,5,10,5,5,25,75,50,75,75,75,17.50,
   25,42.50,50,37.50,95,62.50,95,95,95) / 100

ps.binomial <- list(seq(1, 2.2, length=32), seq(1, 3, length=32))
eq.binomial <- eql(resp~site*variety, param.space=ps.binomial,
   family=extBinomialVarianceFamily())
plot(eq.binomial)

## End(Not run)

[Package EQL version 1.0-1 Index]