R: Several diagnostic plots for checking p-value influencers

Influence plots {reverseR}

R Documentation

Several diagnostic plots for checking p-value influencers

Description

Seven different plot types that visualize p-value influencers.

1. lmPlot: plots the linear regression, marks the influencer(s) in red and displays trend lines for the full and leave-one-out (LOO) data set (black and red, respectively).
2. pvalPlot: plots the p-values for each LOO data point and displays the values as a full model/LOO model plot, together with the alpha border as defined in lmInfl.
3. inflPlot: plots dfbeta for slope, dffits, covratio, cooks.distance, leverage (hatvalues) and studentized residuals (rstudent) against the \Delta p-value. This allows changes in these six measures to be compared with the corresponding drop or rise in p-value. The plots include vertical boundaries for the threshold values defined in the literature listed under 'References'.
4. slsePlot: plots all LOO-slopes and their standard errors together with the corresponding original model values and a t-value border as calculated by \mathit{Q_t}(1 - \frac{\alpha}{2}, n-2). Leaving out a point that lies to the right of this border results in a significant model; leaving out a point to its left results in a non-significant one.
5. threshPlot: plots the output of lmThresh, i.e. the regression plot including confidence/prediction intervals, and, for each response value y_i, the region in which the model is significant (green). This is tested either i) for y_i that are shifted into this region (newobs = FALSE in lmThresh) or ii) for a new observation y2_i that is added (newobs = TRUE in lmThresh). In the latter case, it is informative when this region lies within the prediction interval (dashed line), since this indicates that a future additional measurement at x_i might reverse the significance statement.
6. multPlot: plots the output of lmMult as a point cloud of p-values for each number of removed samples (1...max) and n combinations. All combinations in which the sample removal resulted in a significance reversal are colored red; their percentages are given at the top of the plot.
7. stabPlot: for a single (to be selected) response value from the output of lmThresh, this function displays the region of significance reversal within the surrounding prediction interval. The probability that either shifting the response value (if lmThresh(..., newobs = FALSE)) or including a future (measurement) point (if lmThresh(..., newobs = TRUE)) reverses the significance is shown as the integral between the "end of significance region" (eosr) and the nearest prediction interval boundary.
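
The leave-one-out quantities that lmPlot, pvalPlot and slsePlot visualize can be sketched in base R. This is a minimal illustration and not the package's implementation: the simulated data and the names dat, loo, t_border and reversed are assumptions made for the example, and n in the t-value border is taken to be the number of observations in each LOO fit.

```r
## Illustrative sketch in base R (not the reverseR implementation).
set.seed(1)
n <- 20
dat <- data.frame(x = 1:n)
dat$y <- 5 + 0.08 * dat$x + rnorm(n)

alpha <- 0.05

## Full model and the p-value of its slope.
full <- lm(y ~ x, data = dat)
p_full <- summary(full)$coefficients["x", "Pr(>|t|)"]

## Refit the model n times, each time leaving out one observation,
## and collect slope, standard error and slope p-value.
loo <- t(sapply(seq_len(n), function(i) {
  cf <- summary(lm(y ~ x, data = dat[-i, ]))$coefficients
  c(slope = cf["x", "Estimate"],
    se    = cf["x", "Std. Error"],
    p     = cf["x", "Pr(>|t|)"])
}))
loo <- as.data.frame(loo)

## Change in p-value relative to the full model, and a flag for
## points whose removal flips the significance statement.
loo$delta_p  <- loo$p - p_full
loo$reversed <- (loo$p < alpha) != (p_full < alpha)

## t-value border Q_t(1 - alpha/2, n - 2), with n assumed here to be
## the size of each LOO data set (one observation fewer than the full set).
n_loo    <- n - 1
t_border <- qt(1 - alpha / 2, df = n_loo - 2)
loo$t    <- loo$slope / loo$se

## A LOO fit is significant exactly when |slope / se| exceeds the border.
loo$significant <- abs(loo$t) > t_border
```

Rows with reversed equal to TRUE correspond to the points that would count as p-value influencers in this sketch.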

NOTE: The visual display should always be supplemented with the corresponding stability analysis.

Usage

lmPlot(infl, ...) 
pvalPlot(infl, ...) 
inflPlot(infl, ...)
slsePlot(infl, ...)
threshPlot(thresh, bands = FALSE, ...)
multPlot(mult, log = FALSE, ...)
stabPlot(stab, which = NULL, ...)

Arguments

`infl`	an object obtained from `lmInfl`.
`thresh`	an object obtained from `lmThresh`.
`stab`	an object obtained from using `stability` on an `lmThresh` output.
`bands`	logical. If `TRUE`, plots the confidence and prediction bands.
`mult`	an object obtained from `lmMult`.
`log`	should the p-values be displayed on a logarithmic y-axis?
`which`	which response value should be shown in `stabPlot`?
`...`	other plotting parameters.

Value

The corresponding plot.

Note

Cut-off values for the different influence measures are those defined in Belsley, Kuh & Welsch (1980), with n the number of observations and k the number of model parameters:

dfbeta slope: |\Delta\beta_{1,i}| > 2/\sqrt{n}
dffits: | \mathrm{dffits}_i | > 2\sqrt{2/n}
covratio: |\mathrm{covr}_i - 1| > 3k/n
Cook's D: D_i > Q_F(0.5, k, n - k)
leverage: h_{ii} > 2k/n
studentized residual: t_i > Q_t(0.975, n - k - 1)
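
The cut-offs listed above can be applied with the influence functions in the stats package. The following is a hedged sketch rather than the code used by the package: the simulated data and the names measures and flags are made up for illustration. Two choices are assumptions: the scaled dfbetas() is used for the slope, because 2/\sqrt{n} is the cut-off Belsley et al. give for the scaled measure, and the studentized residual is compared in absolute value.

```r
## Illustrative sketch in base R (not the reverseR implementation).
set.seed(2)
n <- 25
x <- seq_len(n)
y <- 2 + 0.1 * x + rnorm(n)
fit <- lm(y ~ x)

k <- length(coef(fit))  # number of model parameters (2 for a straight line)

## The six influence measures, one value per observation.
measures <- data.frame(
  dfbeta   = dfbetas(fit)[, "x"],   # assumption: scaled dfbeta of the slope
  dffits   = dffits(fit),
  covratio = covratio(fit),
  cooksd   = cooks.distance(fit),
  leverage = hatvalues(fit),
  rstudent = rstudent(fit)
)

## Logical flags: TRUE where a measure exceeds its cut-off.
flags <- data.frame(
  dfbeta   = abs(measures$dfbeta) > 2 / sqrt(n),
  dffits   = abs(measures$dffits) > 2 * sqrt(k / n),   # 2 * sqrt(2 / n) for k = 2
  covratio = abs(measures$covratio - 1) > 3 * k / n,
  cooksd   = measures$cooksd > qf(0.5, k, n - k),
  leverage = measures$leverage > 2 * k / n,
  rstudent = abs(measures$rstudent) > qt(0.975, n - k - 1)  # assumption: two-sided
)

## Indices of observations flagged by at least one measure.
flagged <- which(rowSums(flags) > 0)
```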

Author(s)

Andrej-Nikolai Spiess

References

Regression diagnostics: Identifying influential data and sources of collinearity.
Belsley DA, Kuh E, Welsch RE.
John Wiley, New York (1980).

Applied Regression Analysis: A Research Tool.
Rawlings JO, Pantula SG, Dickey DA.
Springer; 2nd Corrected ed. 1998. Corr. 2nd printing 2001.

Applied Regression Analysis and Generalized Linear Models.
Fox J.
SAGE Publishing, 3rd ed, 2016.

Examples

## See Examples in 'lmInfl', 'lmThresh' and 'lmMult'.
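
For orientation, a typical call sequence might look like the sketch below. It assumes that the reverseR package is installed and that lmInfl, lmThresh and lmMult accept a fitted lm object as their first argument; the simulated data, the object names and the choice which = 1 are illustrative only. The help pages of those functions remain the authoritative reference.

```r
## Sketch only; assumes reverseR is installed and that lmInfl(),
## lmThresh() and lmMult() take a fitted 'lm' object as first argument.
set.seed(125)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
LM1 <- lm(b ~ a)

if (requireNamespace("reverseR", quietly = TRUE)) {
  library(reverseR)

  ## Leave-one-out influence analysis and its four plots.
  infl <- lmInfl(LM1)
  lmPlot(infl)
  pvalPlot(infl)
  inflPlot(infl)
  slsePlot(infl)

  ## Regions of significance for each response value.
  thresh <- lmThresh(LM1)
  threshPlot(thresh, bands = TRUE)

  ## Stability analysis of the lmThresh output, shown for one
  ## response value (which = 1 is an assumed, illustrative choice).
  stab <- stability(thresh)
  stabPlot(stab, which = 1)

  ## Removal of several samples at once, p-values on a log axis.
  mult <- lmMult(LM1)
  multPlot(mult, log = TRUE)
}
```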

[Package reverseR version 0.1 Index]