genCorrelatedData {rockchalk} | R Documentation
Generates a data frame for regression analysis
Description
The output is a data frame (x1, x2, y) with user-specified
correlation between x1 and x2. The y (output) variable is created
according to the equation
y = beta1 + beta2 * x1 + beta3 * x2 + beta4 * x1 * x2 + e.
The arguments determine the scales of the X matrix, the random error, and the slope coefficients.
Usage
genCorrelatedData(
  N = 100,
  means = c(50, 50),
  sds = c(10, 10),
  rho = 0,
  stde = 1,
  beta = c(0, 0.2, 0.2, 0)
)
Arguments
N | Number of cases desired
means | 2-vector of means for x1 and x2
sds | 2-vector of standard deviations for x1 and x2
rho | Correlation coefficient for x1 and x2
stde | Standard deviation of the error term in the data generating equation
beta | Vector of at most 4 coefficients for intercept, slopes, and interaction
Details
The vector (x1, x2) is drawn from a multivariate normal
distribution whose expected value is given by the argument means.
The covariance matrix of X is built from the standard deviations (sds)
and the specified correlation between x1 and x2 (rho).
It is also necessary to specify the standard deviation
of the error term (stde) and the coefficients
of the regression equation (beta).
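The data generating process described above can be sketched in base R. This is an illustration only, not the package's source: gen_sketch is a hypothetical name, the correlated pair is built from two independent standard normal draws rather than a multivariate normal sampler, the error term is assumed to be normal with mean zero, and missing trailing coefficients in beta are assumed to be zero.

```r
# Illustrative sketch only; gen_sketch() is a hypothetical name,
# not the rockchalk implementation.
gen_sketch <- function(N = 100, means = c(50, 50), sds = c(10, 10),
                       rho = 0, stde = 1, beta = c(0, 0.2, 0.2, 0)) {
  stopifnot(length(beta) <= 4, abs(rho) <= 1)
  # Assumption: coefficients that are not supplied are treated as zero
  beta <- c(beta, rep(0, 4 - length(beta)))
  # Two independent standard normal draws
  z1 <- rnorm(N)
  z2 <- rnorm(N)
  # Bivariate normal with covariance matrix
  #   [ sds[1]^2               rho * sds[1] * sds[2] ]
  #   [ rho * sds[1] * sds[2]  sds[2]^2              ]
  x1 <- means[1] + sds[1] * z1
  x2 <- means[2] + sds[2] * (rho * z1 + sqrt(1 - rho^2) * z2)
  # Assumption: the error term is normal with mean 0 and sd stde
  e <- rnorm(N, mean = 0, sd = stde)
  y <- beta[1] + beta[2] * x1 + beta[3] * x2 + beta[4] * x1 * x2 + e
  data.frame(x1 = x1, x2 = x2, y = y)
}

set.seed(1234)
dat2 <- gen_sketch(N = 1000, rho = 0.4)
cor(dat2$x1, dat2$x2)  # close to 0.4, up to sampling error
```

The sample correlation and sample means will differ from rho and means by sampling error, which shrinks as N grows.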
Examples
library(rockchalk)

## 1000 observations of uncorrelated x1 and x2 with no
## interaction between x1 and x2
dat <- genCorrelatedData(N = 1000, rho = 0, beta = c(1, 1.0, -1.1, 0.0))
## 3D scatterplot of the generated data
mcGraph1(dat$x1, dat$x2, dat$y, theta = 20, phi = 8,
         ticktype = "detailed", nticks = 10)
## Fit the additive model and draw the fitted plane
m1 <- lm(y ~ x1 + x2, data = dat)
plotPlane(m1, plotx1 = "x1", plotx2 = "x2")