regress {mStats} | R Documentation
Linear Regression Model
Description
regress()
produces a summary of the model with coefficients and 95% confidence intervals.
`predict.regress`
is an S3 method for predict() that generates statistics related to the prediction of the linear model, using the output from the regress function of mStats.
`plot.regress`
is an S3 method for plot() that creates diagnostic graphs for checking the linear model, using the output from the regress function of mStats.
`ladder`
converts a variable into a normally distributed one.
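The ladder-of-powers idea behind `ladder` can be sketched in base R: apply the standard power transformations and compare how normal each result looks. The Shapiro-Wilk statistic and the airquality example below are illustrative choices, not necessarily what ladder() uses internally.

```r
# Tukey's ladder of powers applied to a strictly positive variable:
# the transformation whose Shapiro-Wilk W is closest to 1 is the most
# "normal-looking" candidate.
x <- na.omit(airquality$Ozone)   # strictly positive example variable

powers <- list(
  cube     = x^3,
  square   = x^2,
  identity = x,
  sqrt     = sqrt(x),
  log      = log(x),
  inv_sqrt = 1 / sqrt(x),
  inverse  = 1 / x
)

w <- sapply(powers, function(z) unname(shapiro.test(z)$statistic))
sort(w, decreasing = TRUE)       # best candidate transformation first
```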
`hettest`
performs the Breusch-Pagan test for heteroskedasticity. It presents evidence against the null hypothesis that t = 0 in Var(e) = sigma^2 * exp(z t). The formulas are based on the bptest function in the lmtest package.
`linkTest`
determines whether a model in R is 'well specified', following Stata's linktest.
Usage
regress(model, vce = FALSE, digits = 5)
## S3 method for class 'regress'
predict(object, ...)
## S3 method for class 'regress'
plot(x, ...)
ladder(data, var)
hettest(regress, studentize = FALSE)
linkTest(model, vce = FALSE, digits = 5)
Arguments
model: a glm or lm model
vce: if TRUE, robust (Huber-White) standard errors are reported
digits: specify rounding of numbers
object: a model object for which prediction is desired
...: additional arguments affecting the predictions produced
x: the coordinates of points in the plot. Alternatively, a single plotting structure, function or any R object with a plot method
data: dataset
var: variable name
regress: output from regress()
studentize: logical. If set to TRUE, Koenker's studentized version of the test statistic is used
Details
regress is based on lm(): it uses lm() to fit the model, and all statistics presented in the function's output are derived from lm(), except the AIC value, which is obtained from AIC().
Outputs
Outputs can be divided into three parts.

- Info of the model: the number of observations (Obs.), F value, p-value from the F test, R-squared, adjusted R-squared, square root of the mean square error (Root MSE) and the AIC value.
- Errors: output from anova(model) is tabulated here. SS, DF and MS indicate the sum of squares, degrees of freedom and mean squares of the errors.
- Regression Output: coefficients from the model summary are tabulated here along with 95% confidence intervals.
Using Robust Standard Errors
If heteroskedasticity is present in our data sample, the ordinary least squares (OLS) estimator remains unbiased and consistent, but is no longer efficient. The estimated OLS standard errors will be biased, and this cannot be fixed by a larger sample size. To remedy this, robust standard errors can be used to adjust the standard errors.
regress uses the sandwich estimator to compute Huber-White standard errors. The calculation is based on the tutorial by Kevin Goulding:

Var_robust = (N / (N - K)) (X'X)^(-1) [ sum_i X_i X_i' e_i^2 ] (X'X)^(-1)

where N is the number of observations and K the number of regressors (including the intercept). This yields a variance-covariance (VCV) matrix whose diagonal elements are the estimated heteroskedasticity-robust coefficient variances, the quantities of interest. Estimated coefficient standard errors are the square roots of these diagonal elements.
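The sandwich formula above can be reproduced in a few lines of base R. This is a sketch of the calculation, not the mStats code itself; regress(model, vce = TRUE) is assumed to wrap the same idea.

```r
fit <- lm(Ozone ~ Wind, data = airquality)

X <- model.matrix(fit)   # N x K design matrix (includes the intercept)
e <- residuals(fit)      # OLS residuals
N <- nrow(X)
K <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^(-1)
meat  <- crossprod(X * e)      # sum_i X_i X_i' e_i^2
vcv   <- (N / (N - K)) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcv))   # robust coefficient standard errors
robust_se
```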
`predict.regress`
returns the original data augmented with statistics for model diagnostics:

- fitted (fitted values)
- resid (residuals)
- std.resid (studentized residuals)
- hat (leverage)
- sigma
- cooksd (Cook's distance)
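These columns map naturally onto base R's diagnostic accessors. The sketch below assembles them for an example model; how predict.regress computes them internally is an assumption.

```r
fit <- lm(Ozone ~ Wind, data = airquality)

diagnostics <- data.frame(
  fitted    = fitted(fit),           # fitted values
  resid     = residuals(fit),        # residuals
  std.resid = rstudent(fit),         # studentized residuals
  hat       = hatvalues(fit),        # leverage
  sigma     = influence(fit)$sigma,  # leave-one-out residual std. deviation
  cooksd    = cooks.distance(fit)    # Cook's distance
)
head(diagnostics)
```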
The Breusch-Pagan test fits a linear regression model to the residuals of a linear regression model (by default the same explanatory variables are taken as in the main regression model) and rejects if too much of the variance is explained by the additional explanatory variables. Under H_0 the test statistic of the Breusch-Pagan test follows a chi-squared distribution with k degrees of freedom, where k is the number of regressors (without the constant) in the model.
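The non-studentized version of this test can be computed by hand. The sketch below follows the original Breusch-Pagan recipe (scaled squared residuals regressed on the model's explanatory variables); it is not hettest()'s internal code.

```r
fit <- lm(Ozone ~ Wind, data = airquality)

e      <- residuals(fit)
sigma2 <- sum(e^2) / length(e)   # ML estimate of the error variance
g      <- e^2 / sigma2           # scaled squared residuals

# Auxiliary regression on the same explanatory variables as the main model
aux  <- lm(g ~ Wind, data = model.frame(fit))
stat <- 0.5 * sum((fitted(aux) - mean(g))^2)   # 0.5 * explained sum of squares
df   <- length(coef(aux)) - 1                  # regressors without the constant
pval <- pchisq(stat, df = df, lower.tail = FALSE)

c(BP = stat, df = df, p.value = pval)
```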
The code for `linkTest`
has been modified from Keith Chamberlain's linktest.
www.ChamberlainStatistics.com
https://gist.github.com/KeithChamberlain/8d9da515e73a27393effa3c9fe571c3f
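Stata's linktest refits the model on its own fitted values and their square; a significant squared term is taken as evidence of misspecification. A minimal sketch of that idea, not the mStats implementation:

```r
fit <- lm(Ozone ~ Wind, data = airquality)

yhat <- fitted(fit)
y    <- model.response(model.frame(fit))   # response with NAs already dropped

link <- lm(y ~ yhat + I(yhat^2))
summary(link)$coefficients   # inspect the p-value on the I(yhat^2) term
```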
Value
a list containing

- info: info and error tables
- reg: regression table
- model: raw model output from lm()
- fit: formula for fitting the model
- lbl: variable labels for further processing in summary
Note
Credits to Kevin Goulding, The Tarzan Blog.
Author(s)
Email: dr.myominnoo@gmail.com
Website: https://myominnoo.github.io/
References
T.S. Breusch & A.R. Pagan (1979), A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica 47, 1287–1294
R. Koenker (1981), A Note on Studentizing a Test for Heteroscedasticity. Journal of Econometrics 17, 107–112.
W. Krämer & H. Sonnberger (1986), The Linear Regression Model under Test. Heidelberg: Physica-Verlag.
Examples
fit <- lm(Ozone ~ Wind, data = airquality)
regress(fit)
## Not run:
## labelling variables
airquality2 <- label(airquality, Ozone = "Ozone level", Wind = "Wind Speed")
fit2 <- lm(Ozone ~ Wind, data = airquality2)
reg <- regress(fit2)
str(reg)
## End(Not run)
## Not run:
predict(reg)
## End(Not run)
## Not run:
plot(reg)
## End(Not run)
ladder(airquality, Ozone)
## Not run:
hettest(reg)
## End(Not run)
## Not run:
linkTest(fit)
## End(Not run)