cooks.distance.overglm {glmtoolbox}    R Documentation

Cook's Distance for alternatives to the Poisson and Binomial Regression Models under the presence of Overdispersion

Description

Produces the one-step approximation of the Cook's distance, which measures the effect on the estimates of the parameters in the linear predictor of deleting each observation in turn. This function can also produce an index plot of the Cook's distance for all parameters in the linear predictor or for a subset of them (via the argument coefs).

Usage

## S3 method for class 'overglm'
cooks.distance(model, plot.it = FALSE, coefs, identify, ...)

Arguments

model

an object of class overglm.

plot.it

an (optional) logical indicating whether the plot is required or just the data matrix on which that plot is based. By default, plot.it is set to FALSE.

coefs

an (optional) character string that (partially) matches the names of some model parameters.

identify

an (optional) integer indicating the number of individuals to identify on the plot of the Cook's distance. This is only appropriate if plot.it=TRUE.

...

further arguments passed to or from other methods. If plot.it=TRUE then ... may be used to include graphical parameters to customize the plot. For example, col, pch, cex, main, sub, xlab, ylab.

Details

The Cook's distance is the distance between two estimates of the parameters in the linear predictor, measured using a metric based on the (estimate of the) variance-covariance matrix. The first set of estimates is computed from the dataset including all individuals, and the second is computed from the dataset in which the i-th individual is excluded. To reduce the computational burden, the second set of estimates is replaced by its one-step approximation. See the dfbeta.overglm documentation.
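The quadratic form described above can be sketched with base R alone. This is only an illustration under stated assumptions, not the glmtoolbox implementation: it uses an ordinary Poisson glm fitted to the built-in warpbreaks data instead of an overglm object, takes dfbeta() as the one-step approximation of the change in the estimates, and divides by the number of parameters, which is an assumed normalization.

```r
# Base-R illustration only; this is not the code used by cooks.distance.overglm.
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# One-step approximations of the change in the estimates when each
# observation is deleted in turn (one row per observation).
delta <- dfbeta(fit)

# Metric: inverse of the estimated variance-covariance matrix.
metric <- solve(vcov(fit))

# Quadratic form for each observation, divided by the number of parameters
# (the division is an assumed normalization).
cd <- rowSums((delta %*% metric) * delta) / ncol(delta)

# Same computation restricted to a subset of parameters, in the spirit of
# the coefs argument (here, the columns whose names contain "tension").
ids <- grepl("tension", colnames(delta))
delta_sub <- delta[, ids, drop = FALSE]
cd_sub <- rowSums((delta_sub %*% solve(vcov(fit)[ids, ids])) * delta_sub) / sum(ids)
```

Restricting both the differences and the variance-covariance matrix to the selected columns keeps the metric consistent with the subset of parameters being examined.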

Value

A matrix with as many rows as individuals in the sample and one column with the values of the Cook's distance.
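Because the result is a one-column matrix, it can be sorted to list the most influential individuals. The following is a minimal sketch, assuming glmtoolbox is installed and using the swimmers model from the Examples section; the names cd and top are introduced here only for illustration.

```r
library(glmtoolbox)
data(swimmers)
fit1 <- overglm(infections ~ frequency + location, family = "nb1(log)", data = swimmers)

# With plot.it = FALSE (the default) only the matrix is produced.
cd <- cooks.distance(fit1)

# Row positions of the three largest values of the Cook's distance.
top <- order(cd[, 1], decreasing = TRUE)[1:3]
cd[top, , drop = FALSE]
```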

Examples

###### Example 1: Self-diagnosed ear infections in swimmers
data(swimmers)
fit1 <- overglm(infections ~ frequency + location, family="nb1(log)", data=swimmers)

### Cook's distance for all parameters in the linear predictor
cooks.distance(fit1, plot.it=TRUE, col="red", lty=1, lwd=1, col.lab="blue",
               col.axis="blue", col.main="black", family="mono", cex=0.8)

### Cook's distance just for the parameter associated with 'frequency'
cooks.distance(fit1, plot.it=TRUE, coefs="frequency", col="red", lty=1, lwd=1,
   col.lab="blue", col.axis="blue", col.main="black", family="mono", cex=0.8)

###### Example 2: Article production by graduate students in biochemistry PhD programs
bioChemists <- pscl::bioChemists
fit2 <- overglm(art ~ fem + kid5 + ment, family="nb1(log)", data = bioChemists)

### Cook's distance for all parameters in the linear predictor
cooks.distance(fit2, plot.it=TRUE, col="red", lty=1, lwd=1, col.lab="blue",
               col.axis="blue", col.main="black", family="mono", cex=0.8)

### Cook's distance just for the parameter associated with 'fem'
cooks.distance(fit2, plot.it=TRUE, coefs="fem", col="red", lty=1, lwd=1,
   col.lab="blue", col.axis="blue", col.main="black", family="mono", cex=0.8)

###### Example 3: Agents to stimulate cellular differentiation
data(cellular)
fit3 <- overglm(cbind(cells,200-cells) ~ tnf + ifn, family="bb(logit)", data=cellular)

### Cook's distance for all parameters in the linear predictor
cooks.distance(fit3, plot.it=TRUE, col="red", lty=1, lwd=1, col.lab="blue",
               col.axis="blue", col.main="black", family="mono", cex=0.8)

### Cook's distance just for the parameter associated with 'tnf'
cooks.distance(fit3, plot.it=TRUE, coefs="tnf", col="red", lty=1, lwd=1,
  col.lab="blue", col.axis="blue", col.main="black", family="mono", cex=0.8)


[Package glmtoolbox version 0.1.12 Index]