toy {collinear} R Documentation
One response and four predictors with varying levels of multicollinearity
Description
Data frame with a known relationship between the response and the predictors, useful for illustrating multicollinearity concepts. Created from the vi data frame using the code shown in the Examples section.
Usage
data(toy)
Format
Data frame with 2000 rows and 5 columns.
Details
Columns:
- y: response variable generated from a * 0.75 + b * 0.25 + noise.
- a: most important predictor of y, uncorrelated with b.
- b: second most important predictor of y, uncorrelated with a.
- c: generated from a + noise.
- d: generated from (a + b)/2 + noise.
These are the variance inflation factors (VIF) of the predictors in toy:
variable vif
b 4.062
d 6.804
c 13.263
a 16.161
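As a rough illustration of where such values come from, the sketch below computes variance inflation factors as the diagonal of the inverse correlation matrix of the predictors, and cross-checks one of them against the equivalent 1 / (1 - R^2) form. It uses base R only and a simulated stand-in named sim (a hypothetical name), where a and b are drawn with rnorm() instead of being taken from vi, so the numbers will not match the table above exactly. Whether the package computes its VIF values this way is an assumption.

```r
# Simulated stand-in for toy (hypothetical; not the packaged data).
# Assumption: a and b are independent standard normal draws.
set.seed(1)
n <- 2000
a <- rnorm(n)
b <- rnorm(n)
sim <- data.frame(
  a = a,
  b = b,
  c = a + runif(n, min = -0.5, max = 0.5),
  d = (a + b) / 2 + runif(n, min = -0.5, max = 0.5)
)

# VIF of each predictor: diagonal of the inverse of the correlation matrix.
vif <- setNames(diag(solve(cor(sim))), names(sim))

# Cross-check for predictor a: VIF = 1 / (1 - R^2), where R^2 comes from
# regressing a on the remaining predictors.
r2_a <- summary(lm(a ~ b + c + d, data = sim))$r.squared
vif_a_lm <- 1 / (1 - r2_a)

# Print predictors ordered from lowest to highest VIF.
print(round(sort(vif), 3))
```

In this simulated setup, a should show a much higher VIF than b, since c is almost a copy of a while b is only partly reproduced by d.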
Examples
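library(collinear)
library(dplyr)

# vi is the source data frame shipped with collinear
data(vi)

set.seed(1)

toy <- vi |>
  # random subset of 2000 rows
  dplyr::slice_sample(n = 2000) |>
  # keep two predictors and rename them
  dplyr::transmute(
    a = soil_clay,
    b = humidity_range
  ) |>
  # center and scale; scale() returns a matrix
  scale() |>
  as.data.frame() |>
  # response and collinear predictors, each with uniform noise
  dplyr::mutate(
    y = a * 0.75 + b * 0.25 + runif(n = dplyr::n(), min = -0.5, max = 0.5),
    c = a + runif(n = dplyr::n(), min = -0.5, max = 0.5),
    d = (a + b) / 2 + runif(n = dplyr::n(), min = -0.5, max = 0.5)
  ) |>
  # reorder columns with the response first
  dplyr::transmute(y, a, b, c, d)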
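Because the generating formula for y is known, a linear model on a and b alone should recover coefficients close to 0.75 and 0.25, while adding the collinear predictors c and d should inflate the standard errors. The following is a minimal sketch of that check in base R. It uses a simulated stand-in (sim and the noise() helper are hypothetical names, and a and b are drawn with rnorm() instead of being taken from vi), so it illustrates the idea without reproducing toy itself.

```r
# Simulated stand-in for toy (hypothetical; not the packaged data).
set.seed(1)
n <- 2000
a <- rnorm(n)
b <- rnorm(n)

# Uniform noise in [-0.5, 0.5], as in the generating code above.
noise <- function() runif(n, min = -0.5, max = 0.5)

sim <- data.frame(
  y = a * 0.75 + b * 0.25 + noise(),
  a = a,
  b = b,
  c = a + noise(),
  d = (a + b) / 2 + noise()
)

# Model with the two true predictors only.
fit_ab <- lm(y ~ a + b, data = sim)

# Model that also includes the collinear predictors c and d.
fit_all <- lm(y ~ a + b + c + d, data = sim)

# Standard errors of the coefficient estimates in each model.
se_ab <- summary(fit_ab)$coefficients[, "Std. Error"]
se_all <- summary(fit_all)$coefficients[, "Std. Error"]

print(round(coef(fit_ab), 3))
print(round(rbind(se_ab = se_ab[c("a", "b")], se_all = se_all[c("a", "b")]), 4))
```

Comparing se_ab and se_all for a and b shows how much precision is lost when redundant predictors enter the model.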
[Package collinear version 1.1.1 Index]