Influence plots {reverseR} | R Documentation |
Several diagnostic plots for checking p-value influencers
Description
Seven different plot types that visualize p-value influencers.
1. lmPlot: plots the linear regression, marks the influencer(s) in red, and displays trend lines for the full and leave-one-out (LOO) data sets (black and red, respectively).
2. pvalPlot: plots the p-values for each LOO data point and displays the values as a full model/LOO model plot, together with the alpha border as defined in lmInfl.
3. inflPlot: plots dfbeta for slope, dffits, covratio, cooks.distance, leverage (hatvalues) and studentized residuals (rstudent) against the \Delta p-value. Changes in these six measures can thus be compared with the corresponding drop or rise in p-value. The plots include vertical boundaries for threshold values as defined in the literature under 'References'.
4. slsePlot: plots all LOO slopes and their standard errors together with the corresponding original model values and a t-value border as calculated by \mathit{Q_t}(1 - \frac{\alpha}{2}, n-2). Leaving out a point that lies to the right of this border results in a significant model, and vice versa.
5. threshPlot: plots the output of lmThresh, i.e. the regression plot including confidence/prediction intervals, together with, for each response value y_i, the region in which the model is significant (green). This is tested either i) for y_i shifted into this region (newobs = FALSE in lmThresh) or ii) for a new observation y2_i that is added (newobs = TRUE in lmThresh). In the latter case, it is informative whether this region resides within the prediction interval (dashed line), which indicates that a future additional measurement at x_i might reverse the significance statement.
6. multPlot: plots the output of lmMult as a point cloud of p-values for each of 1...max sample removals and n combinations. All combinations for which the sample removal resulted in a significance reversal are colored in red; their percentages are given on top of the plot.
7. stabPlot: for single (to be selected) response values from the output of lmThresh, this function displays the region of significance reversal within the surrounding prediction interval. The probability that either shifting the response value (if lmThresh(..., newobs = FALSE)) or including a future measurement point (if lmThresh(..., newobs = TRUE)) reverses the significance is shown as the integral between the "end of significance region" (eosr) and the nearest prediction interval boundary.
NOTE: The visual display should always be supplemented with the corresponding stability analysis.
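The quantities behind pvalPlot and slsePlot can be approximated in base R. The following is only a minimal sketch and does not call any function of this package: the simulated data and the names x, y, alpha, loo, delta_p, reverser and t_border are illustrative assumptions, as is the sign convention of the p-value difference (LOO minus full model). The border uses the full-data n in Q_t(1 - alpha/2, n - 2), following the formula given for slsePlot.

```r
# Base-R sketch of leave-one-out (LOO) p-values, slopes and standard errors.
# Data are simulated for illustration only.
set.seed(123)
x <- 1:20
y <- 5 + 0.08 * x + rnorm(20)
alpha <- 0.05                      # assumed significance level
n <- length(x)

# Slope p-value of the full model
fit <- lm(y ~ x)
p_full <- summary(fit)$coefficients["x", "Pr(>|t|)"]

# Refit with each point left out; collect slope, standard error and p-value
loo <- t(sapply(seq_len(n), function(i) {
  dat <- data.frame(x = x[-i], y = y[-i])
  cf <- summary(lm(y ~ x, data = dat))$coefficients
  cf["x", c("Estimate", "Std. Error", "Pr(>|t|)")]
}))
colnames(loo) <- c("slope", "se", "pval")

# Change in p-value per removed point (assumed convention: LOO minus full)
delta_p <- loo[, "pval"] - p_full

# TRUE where removing the point moves the p-value across alpha
reverser <- (loo[, "pval"] <= alpha) != (p_full <= alpha)

# t-value border as given in the slsePlot description
t_border <- qt(1 - alpha / 2, n - 2)

# Absolute t-ratio of each LOO slope, for comparison with the border
t_ratio <- abs(loo[, "slope"] / loo[, "se"])
```

Plotting loo[, "pval"] against the point index with a horizontal line at alpha gives a rough impression of which removals change the significance statement.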
Usage
lmPlot(infl, ...)
pvalPlot(infl, ...)
inflPlot(infl, ...)
slsePlot(infl, ...)
threshPlot(thresh, bands = FALSE, ...)
multPlot(mult, log = FALSE, ...)
stabPlot(stab, which = NULL, ...)
Arguments
infl |
an object obtained from |
thresh |
an object obtained from |
stab |
an object obtained from using |
bands |
logical. If |
mult |
an object obtained from |
log |
should the p-values be displayed on a logarithmic y-axis? |
which |
which response value should be shown in |
... |
other plotting parameters. |
Value
The corresponding plot.
Note
Cut-off values for the different influence measures are those defined in Belsley, Kuh & Welsch (1980):
dfbeta slope: | \Delta\beta1_i | > 2/\sqrt{n}
dffits: | \mathrm{dffits}_i | > 2\sqrt{2/n}
covratio: |\mathrm{covr}_i - 1| > 3k/n
Cook's D: D_i > Q_F(0.5, k, n - k)
leverage: h_{ii} > 2k/n
studentized residual: t_i > Q_t(0.975, n - k - 1)
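As a rough illustration, the cut-offs above can be applied with the base R influence functions. This is a sketch under stated assumptions, not the package's implementation: k is taken to be the number of model coefficients (2 for a simple linear regression), the standardized dfbetas() is used for the slope because the 2/\sqrt{n} cut-off refers to the scaled measure in the literature, and the data and the name flags are illustrative.

```r
# Base-R sketch: flag observations that exceed the listed cut-offs.
# Data are simulated for illustration only.
set.seed(123)
x <- 1:20
y <- 5 + 0.08 * x + rnorm(20)
fit <- lm(y ~ x)

n <- nobs(fit)            # number of observations
k <- length(coef(fit))    # assumption: k = number of coefficients (2 here)

flags <- data.frame(
  # assumption: scaled dfbetas for the slope coefficient
  dfbeta_slope = abs(dfbetas(fit)[, "x"]) > 2 / sqrt(n),
  dffits       = abs(dffits(fit)) > 2 * sqrt(2 / n),
  covratio     = abs(covratio(fit) - 1) > 3 * k / n,
  cooksD       = cooks.distance(fit) > qf(0.5, k, n - k),
  leverage     = hatvalues(fit) > 2 * k / n,
  # as written in the Note, without an absolute value
  rstudent     = rstudent(fit) > qt(0.975, n - k - 1)
)

# Number of flagged observations per measure
colSums(flags)
```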
Author(s)
Andrej-Nikolai Spiess
References
Regression diagnostics: Identifying influential data and sources of collinearity.
Belsley DA, Kuh E, Welsch RE.
John Wiley, New York (1980).
Applied Regression Analysis: A Research Tool.
Rawlings JO, Pantula SG, Dickey DA.
Springer; 2nd Corrected ed. 1998. Corr. 2nd printing 2001.
Applied Regression Analysis and Generalized Linear Models.
Fox J.
SAGE Publishing, 3rd ed, 2016.
Examples
## See Examples in 'lmInfl', 'lmThresh' and 'lmMult'.