R: Residuals for a VGLM fit

residualsvglm {VGAM}

R Documentation

Residuals for a VGLM fit

Description

Residuals for a vector generalized linear model (VGLM) object.

Usage

residualsvglm(object, type = c("working", "pearson", "response",
   "deviance", "ldot", "stdres", "rquantile"), matrix.arg = TRUE)

Arguments

object

Object of class "vglm", i.e., a vglm fit.

type

The type of residuals to be returned. The value of this argument can be abbreviated. The default is the first one: working residuals corresponding to the IRLS algorithm. These are defined for all models. They are sometimes added to VGAM plots of estimated component functions (see plotvgam).

Pearson residuals for GLMs, when squared and summed over the data set, total to the Pearson chi-squared statistic. For VGLMs, Pearson residuals involve the working weight matrices and the score vectors. Under certain limiting conditions, Pearson residuals have 0 means and identity matrix as the variance-covariance matrix.
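The GLM statement above can be checked with a minimal sketch. It uses base R's glm() rather than vglm(), and the data are simulated, so it illustrates the identity and is not a description of how VGAM computes Pearson residuals.

```r
# Base-R illustration with glm() (an assumption: not a VGAM fit); simulated data.
set.seed(1)
x <- runif(50)
y <- rpois(50, lambda = exp(1 + x))
fit_glm <- glm(y ~ x, family = poisson)

mu <- fitted(fit_glm)
pearson_res <- residuals(fit_glm, type = "pearson")

# For the Poisson family Var(Y) = mu, so each Pearson residual
# is (y - mu) / sqrt(mu).
manual_res <- (y - mu) / sqrt(mu)
all.equal(unname(pearson_res), unname(manual_res))

# Squaring and summing over the data set gives the Pearson
# chi-squared statistic.
pearson_chisq <- sum(pearson_res^2)
pearson_chisq
```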

Response residuals are simply the difference between the observed values and the fitted values. Both have to be of the same dimension, hence not all families have response residuals defined.

Deviance residuals are only defined for models with a deviance function. They pertain mainly to GLMs. This function returns NULL for those models whose deviance is undefined.

Randomized quantile residuals (RQRs) (Dunn and Smyth, 1996) are obtained by feeding the p-type (cumulative distribution) function into qnorm. For example, for the default exponential it is qnorm(pexp(y, rate = 1 / fitted(object))). One should therefore expect these residuals to have a standard normal distribution if the model and data agree well. If the distribution is discrete then randomized values are returned; see runif and set.seed. For example, for the default poissonff it is qnorm(runif(length(y), ppois(y - 1, mu), ppois(y, mu))) where mu is the fitted mean.

The following excerpts come from their writings. They highly recommend quantile residuals for discrete distributions since plots using deviance and Pearson residuals may contain distracting patterns. Four replications of the quantile residuals are recommended with discrete distributions because they have a random component. Any features not preserved across all four sets of residuals are considered artifacts of the randomization. This type of residual is continuous even for discrete distributions; for both discrete and continuous distributions, the quantile residuals have an exact standard normal distribution.
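The two formulas quoted above can be tried directly in base R. In this sketch the data are simulated and the fitted means are replaced by sample means (an assumption standing in for fitted(object)); the helper rqr_pois is hypothetical and not part of VGAM.

```r
# Base-R sketch of the quoted formulas; simulated data.
set.seed(123)

# Continuous case: exponential response.
y_exp <- rexp(200, rate = 0.5)
mu_exp <- rep(mean(y_exp), length(y_exp))  # stand-in for fitted(object)
rqr_exp <- qnorm(pexp(y_exp, rate = 1 / mu_exp))

# Discrete case: Poisson response. A uniform draw between the CDF
# just below y and the CDF at y supplies the randomization.
y_pois <- rpois(200, lambda = 3)
mu_pois <- rep(mean(y_pois), length(y_pois))  # stand-in for the fitted mean
rqr_pois <- function() {  # hypothetical helper
  qnorm(runif(length(y_pois),
              ppois(y_pois - 1, mu_pois),
              ppois(y_pois, mu_pois)))
}

# Four replications, one per column, as recommended for discrete data.
rqr_reps <- replicate(4, rqr_pois())
dim(rqr_reps)
```

Plotting each column of rqr_reps (for example with qqnorm) and keeping only the features common to all four is one way to follow the recommendation above.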

The choice "ldot" should not be used currently.

Standardized residuals are currently only defined for two types of models: (i) GLMs (poissonff, binomialff); (ii) those fitted to a two-way table of counts, e.g., cumulative, acat, multinomial, sratio, cratio. For (ii), they are defined in Section 2.4.5 of Agresti (2018) and are also the output from the "stdres" component of chisq.test. For the test of independence they are a useful type of residual. Their formula is (observed - expected) / sqrt(V), where V is the residual cell variance (also see Agresti, 2007, section 2.4.5). When an independence null hypothesis is true, each standardized residual (corresponding to a cell in the table) has a large-sample standard normal distribution. Currently this function merely extracts the table of counts from object and then computes the standardized residuals like chisq.test.
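For case (ii), the formula (observed - expected) / sqrt(V) can be reproduced by hand and compared with the "stdres" component of chisq.test. The table of counts below is hypothetical, and V is written in the form used under the independence hypothesis: expected * (1 - row proportion) * (1 - column proportion).

```r
# Hypothetical 2 x 3 table of counts.
tab <- matrix(c(20, 30, 50,
                40, 30, 30), nrow = 2, byrow = TRUE)

ct <- chisq.test(tab)
expected <- ct$expected

n <- sum(tab)
row_prop <- rowSums(tab) / n
col_prop <- colSums(tab) / n

# Residual cell variance under independence.
V <- expected * outer(1 - row_prop, 1 - col_prop)

manual_stdres <- (tab - expected) / sqrt(V)
all.equal(unname(ct$stdres), unname(manual_stdres))
```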

matrix.arg

Logical, which applies when the pre-processed answer is a vector or a 1-column matrix. If TRUE then the value returned will be a matrix, else a vector.

Details

This function returns various kinds of residuals, sometimes depending on the specific type of model having been fitted. Section 3.7 of Yee (2015) gives some details on several types of residuals defined for the VGLM class.

Standardized residuals for GLMs are described in Section 4.5.6 of Agresti (2013) as the raw (response) residuals divided by their standard error. They involve the generalized hat matrix evaluated at the final IRLS iteration. When applied to the LM, standardized residuals for GLMs simplify to rstandard. For GLMs they are basically the Pearson residual divided by the square root of 1 minus the leverage.
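A minimal base-R sketch of the last two statements follows. It uses glm() and lm() on simulated data rather than a VGAM fit, so it illustrates the definitions only; the variable names are hypothetical.

```r
# Base-R illustration (an assumption: glm()/lm(), not vglm()); simulated data.
set.seed(2)
x <- runif(40)
y_count <- rpois(40, lambda = exp(0.5 + x))
fit_glm <- glm(y_count ~ x, family = poisson)

# Pearson residual divided by sqrt(1 - leverage); the Poisson
# dispersion is fixed at 1, so no further scaling is needed.
h <- hatvalues(fit_glm)
std_manual <- residuals(fit_glm, type = "pearson") / sqrt(1 - h)
std_builtin <- rstandard(fit_glm, type = "pearson")
all.equal(unname(std_manual), unname(std_builtin))

# LM case: raw residual over its estimated standard error,
# which is what rstandard() returns for an lm fit.
y_norm <- 1 + 2 * x + rnorm(40)
fit_lm <- lm(y_norm ~ x)
lm_manual <- residuals(fit_lm) /
  (summary(fit_lm)$sigma * sqrt(1 - hatvalues(fit_lm)))
all.equal(unname(lm_manual), unname(rstandard(fit_lm)))
```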

Value

If the requested residual type is undefined, inappropriate, or not yet implemented, then NULL is returned; otherwise a matrix or vector of residuals is returned.

Warning

This function may change in the future, especially for those residual types whose definitions may change.

References

Agresti, A. (2007). An Introduction to Categorical Data Analysis, 2nd ed., New York: John Wiley & Sons. Page 38.

Agresti, A. (2013). Categorical Data Analysis, 3rd ed., New York: John Wiley & Sons.

Agresti, A. (2018). An Introduction to Categorical Data Analysis, 3rd ed., New York: John Wiley & Sons.

Dunn, P. K. and Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5, 236–244.

Examples

pneumo <- transform(pneumo, let = log(exposure.time))
fit <- vglm(cbind(normal, mild, severe) ~ let, propodds, pneumo)
resid(fit)  # Same as having type = "working" (the default)
resid(fit, type = "response")
resid(fit, type = "pearson")
resid(fit, type = "stdres")  # Test for independence

[Package VGAM version 1.1-11]