influence_plot {regclass} | R Documentation |
Influence plot for regression diagnostics
Description
This function plots leverage against deleted studentized residuals for a regression model, highlighting points that are influential according to these two measures as well as Cook's distance.
Usage
influence_plot(M,large.cook,cooks=FALSE)
Arguments
M |
A linear regression model fitted with lm() |
large.cook |
The threshold for a "large" Cook's distance. If not specified, a default of 4/n is used. |
cooks |
|
Details
A point is influential if its addition to the data changes the regression substantially. One way of measuring influence is to look at the point's leverage (its distance from the center of the predictors' data cloud, relative to the cloud's shape) and its deleted studentized residual (the size of the residual relative to a regression fit without that point). Points with leverage larger than 2(k+1)/n (where k is the number of predictors and n is the number of observations) and a deleted studentized residual larger than 2 in magnitude are considered influential.
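The leverage/residual rule above can be sketched with base R functions alone. This is a minimal illustration, not the package's own implementation: the model on the built-in mtcars data and the variable names (lev, rstud, lev.cutoff, flagged) are assumptions chosen for the example, and k is counted as the number of non-intercept coefficients.

```r
# Illustrative model on the built-in mtcars data (an assumption, not the TIPS example)
M <- lm(mpg ~ wt + hp, data = mtcars)

n <- nobs(M)                # number of observations
k <- length(coef(M)) - 1    # number of predictors (coefficients minus the intercept)

lev   <- hatvalues(M)       # leverage of each observation
rstud <- rstudent(M)        # deleted studentized residuals

lev.cutoff <- 2 * (k + 1) / n

# Observations that exceed both thresholds described in the text
flagged <- which(lev > lev.cutoff & abs(rstud) > 2)
flagged
```

With factor predictors the number of coefficients differs from the number of predictor variables, so the way k is counted here may not match what influence_plot does internally.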
Influence can also be measured by Cook's distance, which essentially combines the above two measures. This function considers a Cook's distance to be large when it exceeds 4/n, but the user can specify another cutoff with large.cook.
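As a rough sketch of the Cook's distance criterion, the following uses base R on the built-in mtcars data (an assumed stand-in for a real model). It also checks the standard identity that builds Cook's distance from leverage and a studentized residual. Note that the identity uses the internally studentized residual from rstandard(), not the deleted one, which is why the text says "essentially". The names cd, cd.manual, and large are illustrative only.

```r
# Illustrative model on the built-in mtcars data (an assumption)
M <- lm(mpg ~ wt + hp, data = mtcars)

n  <- nobs(M)
cd <- cooks.distance(M)

# Default cutoff described in the text; pass another value to change it
large.cook <- 4 / n
large <- which(cd > large.cook)
large

# Cook's distance rebuilt from leverage and the internally studentized residual
h     <- hatvalues(M)
p     <- length(coef(M))    # number of coefficients, i.e. k + 1
r.int <- rstandard(M)
cd.manual <- r.int^2 / p * h / (1 - h)
all.equal(cd, cd.manual)
```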
The radius of a point is proportional to the square root of the Cook's distance. Influential points according to leverage/residual criteria have an X through them while influential points according to Cook's distance are bolded.
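A plot in the same spirit can be approximated with base graphics. This is only a sketch under assumptions: the mtcars model, the scaling constant 3, and the dashed reference lines are choices made for illustration, and the X marks and bold styling drawn by influence_plot are not reproduced.

```r
# Illustrative model on the built-in mtcars data (an assumption)
M <- lm(mpg ~ wt + hp, data = mtcars)

lev   <- hatvalues(M)
rstud <- rstudent(M)
cd    <- cooks.distance(M)

# Symbol size proportional to sqrt(Cook's distance); the largest point gets cex = 3
size <- 3 * sqrt(cd / max(cd))

plot(lev, rstud, cex = size,
     xlab = "Leverage", ylab = "Deleted studentized residual")
abline(h = c(-2, 2), lty = 2)                        # residual thresholds
abline(v = 2 * length(coef(M)) / nobs(M), lty = 2)   # leverage threshold 2(k+1)/n
```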
The function returns the row numbers of influential observations.
Value
A list with the row numbers of influential points according to Cook's distance ($Cooks) and according to leverage/residual criteria ($Leverage).
Author(s)
Adam Petrie
References
Introduction to Regression and Modeling
See Also
cooks.distance, hatvalues, rstudent
Examples
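library(regclass)  # provides influence_plot() and the TIPS data
data(TIPS)
# Model tip percentage on every other variable except Tip
M <- lm(TipPercentage ~ . - Tip, data = TIPS)
# Draw the plot; the returned list holds the row numbers of influential points
influence_plot(M)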