residuals.rma {metafor}

R Documentation

Residual Values based on 'rma' Objects

Description

Functions to compute residuals and standardized versions thereof for models fitted with the rma.uni, rma.mh, rma.peto, and rma.mv functions.

Usage

## S3 method for class 'rma'
residuals(object, type="response", ...)

## S3 method for class 'rma.uni'
rstandard(model, digits, type="marginal", ...)
## S3 method for class 'rma.mh'
rstandard(model, digits, ...)
## S3 method for class 'rma.peto'
rstandard(model, digits, ...)
## S3 method for class 'rma.mv'
rstandard(model, digits, cluster, ...)

## S3 method for class 'rma.uni'
rstudent(model, digits, progbar=FALSE, ...)
## S3 method for class 'rma.mh'
rstudent(model, digits, progbar=FALSE, ...)
## S3 method for class 'rma.peto'
rstudent(model, digits, progbar=FALSE, ...)
## S3 method for class 'rma.mv'
rstudent(model, digits, progbar=FALSE, cluster,
         reestimate=TRUE, parallel="no", ncpus=1, cl, ...)

Arguments

`object`	an object of class `"rma"` (for `residuals`).
`type`	the type of residuals which should be returned. For `residuals`, the alternatives are: `"response"` (default), `"rstandard"`, `"rstudent"`, and `"pearson"`. For `rstandard.rma.uni`, the alternatives are: `"marginal"` (default) and `"conditional"`. See ‘Details’.
`model`	an object of class `"rma.uni"`, `"rma.mh"`, `"rma.peto"`, or `"rma.mv"` (for `rstandard` and `rstudent`).
`cluster`	optional vector to specify a clustering variable to use for computing cluster-level multivariate standardized residuals (only for `"rma.mv"` objects).
`reestimate`	logical to specify whether variance/correlation components should be re-estimated after deletion of the \(i\textrm{th}\) case when computing externally standardized residuals for `"rma.mv"` objects (the default is `TRUE`).
`parallel`	character string to specify whether parallel processing should be used (the default is `"no"`). For parallel processing, set to either `"snow"` or `"multicore"`. See ‘Note’.
`ncpus`	integer to specify the number of processes to use in the parallel processing.
`cl`	optional cluster to use if `parallel="snow"`. If unspecified, a cluster on the local machine is created for the duration of the call.
`digits`	optional integer to specify the number of decimal places to which the printed results should be rounded. If unspecified, the default is to take the value from the object.
`progbar`	logical to specify whether a progress bar should be shown (only for `rstudent`; the default is `FALSE`).
`...`	other arguments.

Details

The observed residuals are simply the ‘observed - fitted’ values. They can be obtained with residuals(object) (using the default type="response").

Dividing the observed residuals by the model-implied standard errors of the observed effect sizes or outcomes yields Pearson (or semi-standardized) residuals. These can be obtained with residuals(object, type="pearson").

Dividing the observed residuals by their corresponding standard errors yields (internally) standardized residuals. These can be obtained with rstandard(model) or residuals(object, type="rstandard").
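The three kinds of residuals described so far can be compared side by side. The following is a minimal sketch using the BCG vaccine data from the ‘Examples’ section; the manual Pearson calculation assumes a random-effects model without custom weights, so that the model-implied variance of each outcome is vi + tau^2.

```r
library(metafor)

# log risk ratios and sampling variances (same data as in the Examples section)
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data=dat)

raw     <- residuals(res)                  # observed - fitted
pearson <- residuals(res, type="pearson")  # raw residual / model-implied SE of the outcome
rstd    <- rstandard(res)                  # list with components resid, se, and z

# assumption: for this random-effects model the model-implied variance is vi + tau^2
pearson.manual <- c(raw) / sqrt(dat$vi + res$tau2)

# internally standardized residuals divide by the SE of the residual itself
z.manual <- rstd$resid / rstd$se

cbind(raw = c(raw), pearson = c(pearson), rstandard = rstd$z)
```

The Pearson and internally standardized residuals differ because the standard error of a residual is smaller than the standard error of the outcome itself: the fitted value is estimated from the same data.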

With rstudent(model) (or residuals(object, type="rstudent")), one can obtain the externally standardized residuals (also called standardized deleted residuals or (externally) studentized residuals). The externally standardized residual for the \(i\textrm{th}\) case is obtained by deleting the \(i\textrm{th}\) case from the dataset, fitting the model based on the remaining cases, calculating the predicted value for the \(i\textrm{th}\) case based on the fitted model, taking the difference between the observed and the predicted value for the \(i\textrm{th}\) case (which yields the deleted residual), and then standardizing the deleted residual based on its standard error.
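The leave-one-out steps above can be sketched in base R for the simplest case, an equal-effects model with known sampling variances. This is an illustration of the procedure, not the code used by rstudent; the data and the helper function rstudent_ee are hypothetical, and for random-effects models the variance components would also have to be re-estimated in each step.

```r
# hypothetical effect sizes and known sampling variances (illustration only)
yi <- c(0.10, 0.35, -0.20, 0.60, 0.25)
vi <- c(0.04, 0.02, 0.05, 0.03, 0.01)

# deleted residual, its standard error, and the standardized value for case i
rstudent_ee <- function(i, yi, vi) {
   wi    <- 1 / vi[-i]                    # inverse-variance weights without case i
   pred  <- sum(wi * yi[-i]) / sum(wi)    # predicted value based on the remaining cases
   resid <- yi[i] - pred                  # deleted residual
   se    <- sqrt(vi[i] + 1 / sum(wi))     # sampling variance plus variance of the prediction
   c(resid = resid, se = se, z = resid / se)
}

# one row per case, with columns resid, se, and z
out <- t(sapply(seq_along(yi), rstudent_ee, yi = yi, vi = vi))
out
```

In this special case (known variances, no variance components to estimate), the externally and internally standardized residuals coincide; they differ once heterogeneity has to be estimated, because deleting a case then also changes the estimated variance components.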

If a particular case fits the model, its standardized residual follows (asymptotically) a standard normal distribution. A large standardized residual for a case therefore may suggest that the case does not fit the assumed model (i.e., it may be an outlier).
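As a rough screening rule based on the standard normal reference distribution, one can flag cases whose standardized residual exceeds a chosen cutoff in absolute value. The sketch below uses the two-sided 5% cutoff, which is an assumption made here for illustration and not a recommendation from this page; with many cases, some will exceed it by chance alone.

```r
library(metafor)

dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data=dat)

sr <- rstudent(res)                       # externally standardized residuals

cutoff  <- qnorm(0.975)                   # assumed cutoff (about 1.96)
flagged <- which(abs(sr$z) > cutoff)      # positions of potential outliers

# show the flagged studies (if any) together with their residuals
cbind(dat[flagged, c("trial", "author", "year")], z = sr$z[flagged])
```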

For "rma.uni" objects, rstandard(model, type="conditional") computes conditional residuals, which are the deviations of the observed effect sizes or outcomes from the best linear unbiased predictions (BLUPs) of the study-specific true effect sizes or outcomes (see blup).
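A short sketch of how the marginal and conditional residuals can be inspected together, again using the BCG vaccine data. The last column recomputes the deviations from the BLUPs by hand via blup(res)$pred; based on the description above, it is expected (but not verified here) to correspond to the conditional residuals.

```r
library(metafor)

dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data=dat)

mres <- rstandard(res)                      # marginal residuals (the default)
cres <- rstandard(res, type="conditional")  # conditional residuals

# observed outcomes minus the BLUPs of the study-specific true outcomes
dev.blup <- c(dat$yi) - blup(res)$pred

cbind(marginal = mres$resid, conditional = cres$resid, yi.minus.blup = dev.blup)
```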

For "rma.mv" objects, one can specify a clustering variable (via the cluster argument). If specified, rstandard(model) and rstudent(model) also compute cluster-level multivariate (internally or externally) standardized residuals. If all outcomes within a cluster fit the model, then the multivariate standardized residual for the cluster follows (asymptotically) a chi-square distribution with \(k_i\) degrees of freedom (where \(k_i\) denotes the number of outcomes within the cluster).
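The cluster-level residuals can be compared against the chi-square reference distribution mentioned above. The following sketch assumes the dat.konstantopoulos2011 dataset (available when metafor is loaded) and a three-level model with schools nested within districts; the choice of dataset and model is for illustration only.

```r
library(metafor)

dat <- dat.konstantopoulos2011
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

# internally standardized residuals plus district-level multivariate residuals
sr <- rstandard(res, cluster=district)

# upper-tail probabilities under a chi-square distribution with k_i degrees of freedom
pval <- pchisq(sr$cluster$X2, df=sr$cluster$k, lower.tail=FALSE)

data.frame(X2 = sr$cluster$X2, k = sr$cluster$k, pval = round(pval, 4))
```

Small upper-tail probabilities point to districts whose outcomes, taken jointly, do not fit the model well.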

See also influence.rma.uni and influence.rma.mv for other leave-one-out diagnostics that are useful for detecting influential cases in models fitted with the rma.uni and rma.mv functions.

Value

Either a vector with the residuals of the requested type (for residuals) or an object of class "list.rma", which is a list containing the following components:

`resid`	observed residuals (for `rstandard`) or deleted residuals (for `rstudent`).
`se`	corresponding standard errors.
`z`	standardized residuals (internally standardized for `rstandard` or externally standardized for `rstudent`).

When a clustering variable is specified for "rma.mv" objects, the returned object is a list with the first element (named obs) as described above and a second element (named cluster), which is of class "list.rma" and contains:

`X2`	cluster-level multivariate standardized residuals.
`k`	number of observed effect sizes or outcomes within the clusters.

The object is formatted and printed with print. To format the results as a data frame, one can use the as.data.frame function.
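Converting the result to a data frame makes it easy to sort or filter the residuals. A minimal sketch with the BCG vaccine data:

```r
library(metafor)

dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data=dat)

sr <- rstudent(res)
df <- as.data.frame(sr)       # columns resid, se, and z

# order the cases from the largest to the smallest absolute standardized residual
df[order(abs(df$z), decreasing=TRUE), ]
```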

Note

The externally standardized residuals (obtained with rstudent) are calculated by refitting the model \(k\) times (where \(k\) denotes the number of cases). Depending on how large \(k\) is, it may take a few moments to finish the calculations. For complex models fitted with rma.mv, this can become computationally expensive.

On machines with multiple cores, one can try to speed things up by delegating the model fitting to separate worker processes, that is, by setting parallel="snow" or parallel="multicore" and ncpus to some value larger than 1 (only for objects of class "rma.mv"). Parallel processing makes use of the parallel package, using the makePSOCKcluster and parLapply functions when parallel="snow" or using mclapply when parallel="multicore" (the latter only works on Unix/Linux-alikes). With parallel::detectCores(), one can check on the number of available cores on the local machine.
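A sketch of how parallel processing might be requested for an "rma.mv" object. The dataset, the model, and the choice of two worker processes are assumptions for illustration; whether this is actually faster depends on the machine and on how long each model fit takes.

```r
library(metafor)

dat <- dat.konstantopoulos2011
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

# number of cores available on the local machine
parallel::detectCores()

# refit the model once per case, spreading the fits over two worker processes
sr <- rstudent(res, parallel="snow", ncpus=2)
sr
```

On Unix/Linux-alikes, parallel="multicore" can be used in the same way.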

Alternatively (or in addition to using parallel processing), one can also set reestimate=FALSE, in which case any variance/correlation components in the model are not re-estimated after deleting the \(i\textrm{th}\) case from the dataset. Doing so only yields an approximation to the externally standardized residuals (and the cluster-level multivariate standardized residuals) that ignores the influence of the \(i\textrm{th}\) case on the variance/correlation components, but is considerably faster (and often yields similar results).
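To judge how close the approximation is for a given model, one can compute both versions once and compare them. The sketch below again assumes the dat.konstantopoulos2011 dataset and a three-level model, chosen only for illustration.

```r
library(metafor)

dat <- dat.konstantopoulos2011
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

sr.full   <- rstudent(res)                    # variance components re-estimated for each deletion
sr.approx <- rstudent(res, reestimate=FALSE)  # variance components held at their original estimates

# distribution of the absolute differences between the two sets of residuals
summary(abs(sr.full$z - sr.approx$z))

# the two versions plotted against each other
plot(sr.full$z, sr.approx$z, xlab="reestimate=TRUE", ylab="reestimate=FALSE")
abline(0, 1)
```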

It may not be possible to fit the model after deletion of the \(i\textrm{th}\) case from the dataset. This will result in NA values for that case when calling rstudent.

Also, for "rma.mv" objects with a clustering variable specified, it may not be possible to compute the cluster-level multivariate standardized residual for a particular cluster (if the variance-covariance matrix of the residuals within the cluster is not of full rank). This will result in NA for that cluster.

The variable specified via cluster is assumed to be of the same length as the data originally passed to the rma.mv function (and if the data argument was used in the original model fit, then the variable will be searched for within this data frame first). Any subsetting and removal of studies with missing values that was applied during the model fitting is also automatically applied to the variable specified via the cluster argument.

For objects of class "rma.mh" and "rma.peto", rstandard actually computes Pearson (or semi-standardized) residuals.

Author(s)

Wolfgang Viechtbauer wvb@metafor-project.org https://www.metafor-project.org

References

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03

Viechtbauer, W. (2021). Model checking in meta-analysis. In C. H. Schmid, T. Stijnen, & I. R. White (Eds.), Handbook of meta-analysis (pp. 219–254). Boca Raton, FL: CRC Press. https://doi.org/10.1201/9781315119403

Viechtbauer, W., & Cheung, M. W.-L. (2010). Outlier and influence diagnostics for meta-analysis. Research Synthesis Methods, 1(2), 112–125. https://doi.org/10.1002/jrsm.11

Examples

### calculate log risk ratios and corresponding sampling variances
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)

### fit random-effects model
res <- rma(yi, vi, data=dat)

### compute the studentized residuals
rstudent(res)

### fit mixed-effects model with absolute latitude as moderator
res <- rma(yi, vi, mods = ~ ablat, data=dat)

### compute the studentized residuals
rstudent(res)

[Package metafor version 4.6-0]