anscombe_leverage {quartets}    R Documentation
Anscombe's Quartet High Leverage Data
Description
This dataset contains 11 observations generated by Francis Anscombe to
demonstrate that statistical summary measures alone cannot capture the full
relationship between two variables (here, x and y). Anscombe emphasized
the importance of visualizing data prior to calculating summary statistics.
Usage
anscombe_leverage
Format
A dataframe with 11 rows and 2 variables:
- x: the x-variable
- y: the y-variable
Details
This dataset has no relationship between x and y, with a single
high-leverage point.
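A minimal sketch of how the high-leverage point could be located using hat values from a fitted linear model. It assumes that the observations match the x4 and y4 columns of base R's built-in anscombe data frame (Anscombe's fourth set); that correspondence is an assumption made here so the example runs without the package, and the names dat, fit, lev and top are illustrative.

```r
# Assumption: base R's `anscombe` columns x4/y4 hold the same 11 observations
# as anscombe_leverage; swap in the packaged data frame if it is available.
dat <- data.frame(x = anscombe$x4, y = anscombe$y4)

# Ordinary least squares fit of y on x.
fit <- lm(y ~ x, data = dat)

# Hat values measure leverage: how far each x lies from the mean of x.
lev <- hatvalues(fit)

# Row index of the observation with the largest leverage.
top <- which.max(lev)

dat[top, ]             # the high-leverage observation
round(unname(lev), 3)  # leverage of every observation
```

Under that assumption, ten observations share x = 8 and one sits at x = 19, so the single distant point carries a hat value of 1 while the others each carry 0.1.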
Additionally, the following statistical summaries hold:
- mean of x: 9
- variance of x: 11
- mean of y: 7.5
- variance of y: 4.125
- correlation between x and y: 0.816
- linear regression between x and y: y = 3 + 0.5x
- R^2 for the regression: 0.67
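The summaries listed above can be checked numerically. The sketch below again assumes that the data match the x4 and y4 columns of base R's built-in anscombe data frame; the object names dat, summ and fit are illustrative. Computed values may differ from the rounded figures above in the third decimal place.

```r
# Assumption: base R's `anscombe` columns x4/y4 stand in for anscombe_leverage.
dat <- data.frame(x = anscombe$x4, y = anscombe$y4)

# Univariate summaries and the Pearson correlation.
summ <- c(
  mean_x = mean(dat$x),
  var_x  = var(dat$x),
  mean_y = mean(dat$y),
  var_y  = var(dat$y),
  cor_xy = cor(dat$x, dat$y)
)

# Simple linear regression of y on x.
fit <- lm(y ~ x, data = dat)

round(summ, 3)                    # means, variances, correlation
round(coef(fit), 2)               # intercept and slope
round(summary(fit)$r.squared, 2)  # R^2 for the regression
```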
References
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". The American Statistician. 27 (1): 17–21. doi:10.1080/00031305.1973.10478966. JSTOR 2682899.