lmExact {reverseR} | R Documentation |
Create random values that deliver linear regressions with exact parameters
Description
Takes self-supplied x/y values or x/random values and transforms them so as to deliver linear regressions y = \beta_0 + \beta_1 x + \varepsilon (with potential replicates) with either
1) exact slope \beta_1 and intercept \beta_0,
2) exact p-value and intercept \beta_0, or
3) exact R^2 and intercept \beta_0.
Intended for testing and education, not for cheating! ;-)
Usage
lmExact(x = 1:20, y = NULL, ny = 1, intercept = 0, slope = 0.1, error = 0.1,
seed = 123, pval = NULL, rsq = NULL, plot = TRUE, verbose = FALSE, ...)
Arguments
x |
the predictor values. |
y |
|
ny |
the number of replicate response values per predictor value. |
intercept |
the desired intercept |
slope |
the desired slope |
error |
if a single value, the standard deviation |
seed |
the random generator seed for reproducibility. |
pval |
the desired p-value of the slope. |
rsq |
the desired R^2. |
plot |
logical. If |
verbose |
logical. If |
... |
Details
For case 1), the error values are added to the exact (x_i, \beta_0 + \beta_1 x_i) values, the linear model y_i = \beta_0 + \beta_1 x_i + \varepsilon is fit, and the residuals y_i - \hat{y}_i are re-added to (x_i, \beta_0 + \beta_1 x_i).
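The re-adding step works because least-squares residuals are orthogonal to both the intercept column and the predictor, so refitting the exact line plus those residuals returns the target coefficients. A minimal base-R sketch of this idea follows; it is an illustration under that assumption, not the package source, and the variable names (b0, b1, y_exact, y_noisy, y_new) and parameter values are chosen here only for the example.

```r
## Sketch of the residual re-adding idea in base R (not the package source).
set.seed(123)                      # arbitrary seed for reproducibility
x  <- 1:20
b0 <- 3                            # desired intercept
b1 <- 2                            # desired slope

y_exact <- b0 + b1 * x             # points lying exactly on the target line
y_noisy <- y_exact + rnorm(length(x), mean = 0, sd = 2)

fit   <- lm(y_noisy ~ x)           # fitted line deviates from the target line
y_new <- y_exact + residuals(fit)  # re-add the residuals to the exact line

refit <- lm(y_new ~ x)
coef(refit)                        # matches b0 and b1 up to floating-point error
```

The refit also reproduces the residuals of the first fit, which can be checked with all.equal(residuals(fit), residuals(refit)).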
For case 2), the same procedure as in 1) is applied, except that the slope delivering the desired p-value is found by an optimization algorithm.
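One way to picture the search for the slope is a one-dimensional root-finding problem: with the residuals held fixed, the p-value of the slope shrinks as the slope grows, so a single slope gives the target p-value. The sketch below uses base R's uniroot() as a stand-in; which optimizer the package actually uses is not stated here, and the helper slope_pval() and all variable names are hypothetical.

```r
## Sketch: find the slope that yields a target p-value (not the package source).
set.seed(123)                          # arbitrary seed for reproducibility
x        <- 1:50
b0       <- 2                          # desired intercept
target_p <- 0.025                      # desired p-value of the slope

## Residuals orthogonal to the intercept and x, as in case 1).
res <- residuals(lm(rnorm(length(x), mean = 0, sd = 3) ~ x))

## Hypothetical helper: p-value of the slope for a candidate slope b1.
slope_pval <- function(b1) {
  y <- b0 + b1 * x + res
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}

## Standard error of the slope; it does not depend on b1 because the
## residuals are fixed. Used only to bracket the root.
n  <- length(x)
se <- sqrt(sum(res^2) / (n - 2) / sum((x - mean(x))^2))

root <- uniroot(function(b1) slope_pval(b1) - target_p,
                interval = c(0, 20 * se), tol = 1e-12)
b1 <- root$root                        # slope delivering the target p-value
slope_pval(b1)                         # agrees with target_p up to solver tolerance
```

Because the standard error is fixed, the same slope could also be computed in closed form as se * qt(1 - target_p / 2, n - 2); the root search is shown only to mirror the description above.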
Finally, for case 3), a QR reconstruction, rescaling and refitting are performed, based on the code referenced under 'References'.
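The general idea behind case 3) can be sketched in base R: take the part of the noise that is orthogonal to x, mix it with x so that the correlation equals the square root of the desired R^2, then shift the response to the desired intercept. This follows the approach discussed in the referenced thread as far as it is understood here and is not the package source; it covers only the case without replicates, and all variable names are chosen for illustration.

```r
## Sketch: response with an exact R^2 and intercept (not the package source).
set.seed(123)                        # arbitrary seed for reproducibility
x   <- 1:10
b0  <- 1                             # desired intercept
rsq <- 0.85                          # desired R^2
rho <- sqrt(rsq)                     # corresponding correlation (positive slope)

e      <- rnorm(length(x), mean = 0, sd = 2)
e_perp <- residuals(lm(e ~ x))       # noise component orthogonal to x (QR-based fit)

## Mix x and the orthogonal noise so that cor(x, y) equals rho exactly.
y <- rho * sd(e_perp) * x + sqrt(1 - rho^2) * sd(x) * e_perp

## Shift to the desired intercept; a constant shift leaves R^2 unchanged.
y <- y - unname(coef(lm(y ~ x))[1]) + b0

fit <- lm(y ~ x)
summary(fit)$r.squared               # equals rsq up to floating-point error
coef(fit)[1]                         # equals b0 up to floating-point error
```

In this construction the slope is not a free parameter: it follows from the desired R^2 and the scale of the noise.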
If y is supplied, changes in slope, intercept and p-value will deliver the same residuals as the linear regression through x and y. A different R^2 will change the response value structure, however.
Value
A list with the following items:
lm |
the linear model of class |
x |
the predictor values. |
y |
the (random) response values. |
summary |
the model summary for quick checking of obtained parameters. |
Using both x and y will give a linear regression with the desired parameter values when refitted.
Author(s)
Andrej-Nikolai Spiess
References
For method 3):
http://stats.stackexchange.com/questions/15011/generate-a-random-variable-with-a-defined-correlation-to-an-existing-variable.
Examples
## No replicates, intercept = 3, slope = 2, sigma = 2, n = 20.
res1 <- lmExact(x = 1:20, ny = 1, intercept = 3, slope = 2, error = 2)
## Same as above, but with 3 replicates, sigma = 1, n = 20.
res2 <- lmExact(x = 1:20, ny = 3, intercept = 3, slope = 2, error = 1)
## No replicates, intercept = 2 and p-value = 0.025, sigma = 3, n = 50.
## => slope = 0.063
res3 <- lmExact(x = 1:50, ny = 1, intercept = 2, pval = 0.025, error = 3)
## 5 replicates, intercept = 1, R-square = 0.85, sigma = 2, n = 10.
## => slope = 0.117
res4 <- lmExact(x = 1:10, ny = 5, intercept = 1, rsq = 0.85, error = 2)
## Heteroscedastic (magnitude-dependent) noise.
error <- sapply(1:20, function(x) rnorm(3, 0, x/10))
res5 <- lmExact(x = 1:20, ny = 3, intercept = 1, slope = 0.2,
                error = error)
## Supply own x/y values, residuals are similar to an
## initial linear regression.
X <- c(1.05, 3, 5.2, 7.5, 10.2, 11.7)
set.seed(123)
Y <- 0.5 + 2 * X + rnorm(6, 0, 2)
res6 <- lmExact(x = X, y = Y, intercept = 1, slope = 0.2)
all.equal(residuals(lm(Y ~ X)), residuals(res6$lm))