R: Nonlinear Heteroscedastic Model Diagnostics

nlreg.diag {nlreg}

R Documentation

Nonlinear Heteroscedastic Model Diagnostics

Description

Calculates different types of residuals, influence measures and leverages for a nonlinear heteroscedastic model.

Usage

  nlreg.diag(fitted, hoa = TRUE, infl = TRUE, trace = FALSE)

Arguments

`fitted`	a `nlreg` object, that is, the result of a call to `nlreg`.
`hoa`	logical value indicating whether higher order asymptotics should be used for calculating the regression diagnostics. Default is `TRUE`.
`infl`	logical value indicating whether influence measures should be calculated on the basis of a leave-one-out analysis. Default is `TRUE`.
`trace`	logical value. If `TRUE`, details of the iterations are printed. Default is `FALSE`.

Details

The regression diagnostics implemented in the nlreg.diag routine follow two approaches. The first exploits, where possible, the analogy with linear models: it applies the classical definitions of residuals, leverages and Cook's distance after linearizing the nonlinear model through a Taylor series expansion (Carroll and Ruppert, 1988, Section 2.8). The second approach uses the mean shift outlier model (Cook and Weisberg, 1982, Section 2.2.2), where a dummy variable is included for one observation at a time, the model is refitted, and the significance of the corresponding coefficient is assessed.

The leverages are defined in analogy to the linear case (Brazzale, 2000, Appendix A.2.2). Two versions are available. In the first case, the sub-block of the inverse of the expected information matrix corresponding to the regression coefficients is used in the definition. In the second case, this matrix is replaced by the inverse of M'WM, where M is the n x p matrix whose ith row is the gradient of the mean function evaluated at the ith data point and W is a diagonal matrix whose elements are the inverses of the variance function evaluated at each data point.
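The second construction can be sketched in base R. This is only an illustration under stated assumptions, not the code used by nlreg.diag: it assumes the usual weighted hat-matrix form W^(1/2) M (M'WM)^(-1) M' W^(1/2), borrows the mean function and the `weights` formula from the Examples section (taking the latter to play the role of the variance function), and uses made-up parameter values and time points. The exact definitions are those in Brazzale (2000, Appendix A.2.2).

```r
# Mean function from the Examples section: mu(t) = b0 * (1 - exp(-b1 * t)).
# All numeric values below are made up for illustration.
time <- c(0.5, 1, 2, 4, 6, 8, 10, 15)
b0 <- 4; b1 <- 0.2; g <- 0.5

# M: n x p matrix whose ith row is the gradient of the mean function
# with respect to (b0, b1), evaluated at the ith time point.
M <- cbind(b0 = 1 - exp(-b1 * time),
           b1 = b0 * time * exp(-b1 * time))

# Diagonal of W: inverses of the variance function (1 + t^g)^2.
w <- 1 / (1 + time^g)^2

# Assumed weighted hat matrix W^(1/2) M (M'WM)^(-1) M' W^(1/2).
Mw <- sqrt(w) * M                          # scales row i of M by sqrt(w_i)
H  <- Mw %*% solve(crossprod(Mw), t(Mw))   # crossprod(Mw) is M'WM
h  <- diag(H)                              # leverage-type quantities

round(h, 3)
sum(h)   # trace of a projection matrix: equals p = 2 up to rounding
```

Because H is a projection matrix, its diagonal elements lie between 0 and 1 and sum to the number of regression coefficients, which is what makes them comparable to leverages in the linear model.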

If the model is correctly specified, all residuals approximately follow the standard normal distribution. The approximate studentized residuals are calculated from the second kind of leverages described above, whereas the generalized Pearson residuals use the first kind. The ith generalized Pearson residual can also be obtained as the score statistic for testing the significance of the dummy coefficient in the mean shift outlier model for observation i. Accordingly, the ith deletion and r^*-type residuals are defined, respectively, as the likelihood root and modified likelihood root statistics (r and r^*) for the same test (Bellio, 2000, Section 2.6.1).
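The mean shift outlier idea behind the deletion residuals can be illustrated with base R's nls. The sketch below is a simplified stand-in, not the nlreg implementation: it assumes homoscedastic normal errors, uses simulated data with a planted outlier, and computes only the likelihood root r for the dummy coefficient (the r^* modification is not shown). The names `likelihood_root`, `delta` and `d` are made up for the example.

```r
set.seed(1)
n <- 20
x <- seq(0.5, 15, length.out = n)
y <- 4 * (1 - exp(-0.2 * x)) + rnorm(n, sd = 0.1)   # simulated data
y[7] <- y[7] + 1                                    # plant an outlier at observation 7
dat <- data.frame(x = x, y = y)

# Fit without any mean shift.
fit0 <- nls(y ~ b0 * (1 - exp(-b1 * x)), data = dat,
            start = list(b0 = 4, b1 = 0.2))

# Likelihood root for the dummy coefficient of observation i:
# sign(delta_hat) * sqrt(2 * (loglik with dummy - loglik without)).
likelihood_root <- function(i) {
  dat$d <- as.numeric(seq_len(n) == i)   # dummy variable for observation i
  tryCatch({
    fit1 <- nls(y ~ b0 * (1 - exp(-b1 * x)) + delta * d, data = dat,
                start = c(as.list(coef(fit0)), delta = 0))
    lr <- 2 * (as.numeric(logLik(fit1)) - as.numeric(logLik(fit0)))
    sign(coef(fit1)[["delta"]]) * sqrt(max(lr, 0))
  }, error = function(e) NA_real_)       # NA if a refit fails to converge
}

r <- vapply(seq_len(n), likelihood_root, numeric(1))
which.max(abs(r))   # index of the most outlying observation under this measure
```

Each observation requires one refit of the model, which is why this type of residual is more expensive than the ones based on linearization.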

Several influence measures are implemented in nlreg.diag. If infl = TRUE, three measures are returned: the global measure (Cook and Weisberg, 1982, Section 5.2) and two partial ones (Bellio, 2000, Section 2.6.2), the first measuring the influence of each observation on the regression coefficients and the second its influence on the variance parameters. They are calculated through a leave-one-out analysis, where one observation at a time is deleted and the model refitted. To avoid a further model fit, the constrained maximum likelihood estimates that would be needed are approximated by means of a Taylor series expansion centered at the MLEs. If infl = FALSE, only an approximation to Cook's distance, obtained from a first order Taylor series expansion of the partial influence measure for the regression coefficients, is returned.
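A leave-one-out analysis of this kind can be sketched with base R's nls. The example assumes that the global measure has the likelihood displacement form 2 * (l(full-data MLE) - l(estimate without observation i)), with l the full-data log-likelihood; this form, the homoscedastic normal model and the simulated data are assumptions made for illustration, and the sketch refits the model exactly rather than using the Taylor approximations described above. The helper names `mu`, `loglik` and `l_hat` are made up.

```r
set.seed(2)
n <- 20
x <- seq(0.5, 15, length.out = n)
y <- 4 * (1 - exp(-0.2 * x)) + rnorm(n, sd = 0.1)   # simulated data
y[3] <- y[3] + 1                                    # made-up influential point
dat <- data.frame(x = x, y = y)

mu <- function(b, x) b[["b0"]] * (1 - exp(-b[["b1"]] * x))

# Full-data normal log-likelihood at coefficients b and error variance s2.
loglik <- function(b, s2) {
  sum(dnorm(dat$y, mean = mu(b, dat$x), sd = sqrt(s2), log = TRUE))
}

fit <- nls(y ~ b0 * (1 - exp(-b1 * x)), data = dat,
           start = list(b0 = 4, b1 = 0.2))
l_hat <- loglik(coef(fit), mean(resid(fit)^2))   # ML variance estimate is RSS / n

# Delete one observation at a time, refit, and evaluate the full-data
# log-likelihood at the estimates obtained without that observation.
ld <- vapply(seq_len(n), function(i) {
  fit_i <- nls(y ~ b0 * (1 - exp(-b1 * x)), data = dat[-i, ],
               start = as.list(coef(fit)))
  2 * (l_hat - loglik(coef(fit_i), mean(resid(fit_i)^2)))
}, numeric(1))

which.max(ld)   # observation whose deletion moves the fit the most
```

Since the full-data estimates maximize the full-data log-likelihood, every value of `ld` is nonnegative up to numerical error, and large values point to observations with a strong effect on the fit.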

A detailed account of regression diagnostics can be found in Davison and Snell (1991) and Davison and Tsai (1992). The details and in particular the definitions of the above residuals and diagnostics are given in Brazzale (2000, Section 6.3.1 and Appendix A.2.2).

Value

Returns an object of class nlreg.diag with the following components:

`fitted`	the fitted values, that is, the mean function evaluated at each data point.
`resid`	the response (or standardized) residuals from the fit.
`rp`	the generalized Pearson residuals from the fit.
`rs`	the approximate studentized residuals from the fit.
`rj`	the deletion residuals from the fit; only if `hoa = TRUE`.
`rsj`	the `r^*`-type residuals from the fit; only if `hoa = TRUE`.
`h`	the leverages of the observations.
`ha`	the approximate leverages of the observations.
`cook`	an approximation to Cook's distance for the regression coefficients.
`ld`	the global influence of each observation; only for heteroscedastic errors and if `infl = TRUE`.
`ld.rc`	the partial influence of each observation on the estimates of the regression coefficients; only for heteroscedastic errors and if `infl = TRUE`.
`ld.vp`	the partial influence of each observation on the estimates of the variance parameters; only for heteroscedastic errors and if `infl = TRUE`.
`npar`	the number of regression coefficients.
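The components can be combined to screen observations. The helper below is a hypothetical convenience function, not part of nlreg: it uses only the documented component names `rs`, `h` and `npar`, and its cut-offs (|residual| > 2, leverage > 2 * npar / n) are conventional rules of thumb borrowed from linear models rather than recommendations of the package. It is demonstrated on a mock list with made-up numbers so that it runs without a fitted model.

```r
# `d` is any list with components `rs`, `h` and `npar`, such as the object
# returned by nlreg.diag(). The default cut-offs are assumptions.
flag_obs <- function(d, resid_cut = 2, lev_mult = 2) {
  n <- length(d$h)
  list(outlying      = which(abs(d$rs) > resid_cut),
       high_leverage = which(d$h > lev_mult * d$npar / n))
}

# Mock object with made-up numbers, standing in for a real nlreg.diag result.
mock <- list(rs   = c(0.3, -2.6, 1.1, 0.4, -0.8),
             h    = c(0.2, 0.3, 0.9, 0.3, 0.3),
             npar = 2)
flag_obs(mock)
```

With a real fit, the same call would be `flag_obs(nlreg.diag(fitted))`, where `fitted` is an nlreg object.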

Side Effects

If trace = TRUE and either hoa = TRUE or infl = TRUE, the number of the observation currently considered in the mean shift outlier model or omitted in the leave-one-out analysis (see the Details section above) is printed.

Acknowledgments

This function is based on A. J. Canty's function glm.diag contained in package boot.

Note

The calculation of the deletion and r^*-type residuals and of the influence measures can be time-consuming. In the first case, the mean shift outlier model has to be refitted as many times as there are observations. In the second case, the original model is refitted the same number of times, with one observation deleted at a time. Furthermore, the definition of the r^*-type residuals requires differentiation of the mean function of the mean shift outlier model. These calculations can be avoided by setting the arguments hoa and infl to FALSE instead of their default TRUE.

To obtain some of the regression diagnostics (typically those based on higher order statistics), the model is repeatedly refitted for different values of the mean shift outlier model parameter. Convergence problems may occasionally occur because the starting values are chosen automatically. A try construct is used to prevent the nlreg.diag function from breaking down. Hence, the values of the diagnostics are not available for the observations where a convergence problem was encountered. A warning is issued whenever this occurs.

References

Bellio, R. (2000) Likelihood Asymptotics: Applications in Biostatistics. Ph.D. Thesis, Department of Statistics, University of Padova.

Brazzale, A. R. (2000) Practical Small-Sample Parametric Inference. Ph.D. Thesis N. 2230, Department of Mathematics, Swiss Federal Institute of Technology Lausanne.

Carroll, R. J. and Ruppert, D. (1988) Transformation and Weighting in Regression. London: Chapman & Hall.

Cook, R. D. and Weisberg, S. (1982) Residuals and Influence in Regression. New York: Chapman & Hall.

Davison, A. C. and Snell, E. J. (1991) Residuals and diagnostics. In Statistical Theory and Modelling: In Honour of Sir David Cox (eds. D. V. Hinkley, N. Reid, and E. J. Snell), 83–106. London: Chapman & Hall.

Davison, A. C. and Tsai, C.-L. (1992) Regression model diagnostics. Int. Stat. Rev., 60, 337–353.

Examples

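library(nlreg)
library(boot)    # provides the calcium data set
data(calcium)
## fit the nonlinear heteroscedastic model
calcium.nl <- nlreg( cal ~ b0*(1-exp(-b1*time)), weights = ~ ( 1+time^g )^2, 
                     data=calcium, start = c(b0 = 4, b1 = 0.1, g = 1), 
                     hoa = TRUE )
## all diagnostics (defaults: hoa = TRUE, infl = TRUE)
calcium.diag <- nlreg.diag( calcium.nl )
plot( calcium.diag, which = 9 )
## cheaper diagnostics, skipping the repeated refits
calcium.diag <- nlreg.diag( calcium.nl, hoa = FALSE, infl = FALSE)
plot(calcium.diag, which = 9)
## Not available with these settings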

[Package nlreg version 1.2-2.2 Index]