R: Apply Goodness of Fit Test to Residuals of a Linear Model

testLMNormal {gofedf}

R Documentation

Apply Goodness of Fit Test to Residuals of a Linear Model

Description

testLMNormal checks the normality assumption for the residuals of a linear model. The function can take a response variable and a design matrix, fit a linear model, and apply the goodness-of-fit test to the residuals. Alternatively, it can take an object of class "lm" and apply the goodness-of-fit test directly. The function returns a goodness-of-fit statistic along with an approximate p-value.
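The idea behind the test can be sketched in base R. This is a conceptual illustration only, not the package's implementation: it assumes the error standard deviation is estimated by the maximum likelihood formula, and the variable names are hypothetical.

```r
# Conceptual sketch: residuals of a linear model mapped to (0, 1).
set.seed(42)
n <- 40
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)

fit <- lm(y ~ x)          # fit the linear model
res <- residuals(fit)     # raw residuals

# Assumption: scale by the ML estimate of the error standard deviation.
sigma_hat <- sqrt(mean(res^2))

# Probability integral transform: if the errors are normal, these values
# should look roughly like a sample from the uniform distribution on (0, 1).
pit <- pnorm(res / sigma_hat)
```

An EDF-based test then measures how far the empirical distribution of `pit` is from the uniform distribution.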

Usage

testLMNormal(
  x,
  y,
  fit = NULL,
  ngrid = length(y),
  gridpit = FALSE,
  hessian = FALSE,
  method = "cvm"
)

Arguments

`x`	is either a numeric vector or a design matrix. In the design matrix, rows represent observations and columns represent covariates.
`y`	is a numeric vector whose length equals the length of `x` (if `x` is a vector) or the number of rows of `x` (if `x` is a design matrix).
`fit`	an object of class "lm" returned by the `lm` function in the `stats` package. The default value is NULL. If an object is provided, `x` and `y` are ignored and the class of the object is checked. If you pass an object to `fit`, make sure it contains the design matrix and the response variable by setting `x = TRUE` and `y = TRUE` in the call to `lm`. For details, see the help documentation for `lm` or the example below.
`ngrid`	the number of equally spaced points used to discretize the (0,1) interval for computing the covariance function.
`gridpit`	logical. If `TRUE`, the parameter `ngrid` is ignored and the (0,1) interval is divided based on the probability integral transform (PIT) values obtained from the sample. If `FALSE` (the default value, as shown in Usage), the interval is divided into `ngrid` equally spaced points for computing the covariance function.
`hessian`	logical. If `TRUE`, the Fisher information matrix is estimated by the observed Hessian matrix based on the sample. If `FALSE` (the default value), the Fisher information matrix is estimated by the variance of the observed score matrix.
`method`	a character string indicating which goodness-of-fit statistic is to be computed. The default value is 'cvm' for the Cramér-von Mises statistic. Other options are 'ad' for the Anderson-Darling statistic and 'both' to compute both statistics.
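The grid and Fisher information options can be illustrated with a small base R sketch. It uses a plain normal sample with parameters (mean, standard deviation) rather than a full linear model, and the exact grid positions and estimators used inside the package are assumptions here, not taken from its source.

```r
# Sketch under stated assumptions; not the package's internal code.
set.seed(1)
z <- rnorm(200, mean = 2, sd = 3)
n <- length(z)
mu_hat <- mean(z)
sigma_hat <- sqrt(mean((z - mu_hat)^2))   # ML estimate of the sd

# Two ways to choose grid points in (0, 1):
ngrid <- n
grid_equal <- seq_len(ngrid) / (ngrid + 1)       # assumed equally spaced layout
grid_pit <- sort(pnorm(z, mu_hat, sigma_hat))    # sample-based PIT values

# Score of the normal log-density with respect to (mu, sigma), one row per
# observation, evaluated at the ML estimates.
score <- cbind(
  mu    = (z - mu_hat) / sigma_hat^2,
  sigma = -1 / sigma_hat + (z - mu_hat)^2 / sigma_hat^3
)
# Fisher information estimated from the scores (they average to zero at the MLE).
fisher_score <- crossprod(score) / n

# Fisher information estimated from the observed Hessian (negative average
# of the second derivatives of the log-density).
h_mm <- rep(-1 / sigma_hat^2, n)
h_ms <- -2 * (z - mu_hat) / sigma_hat^3
h_ss <- 1 / sigma_hat^2 - 3 * (z - mu_hat)^2 / sigma_hat^4
fisher_hessian <- -matrix(
  c(mean(h_mm), mean(h_ms), mean(h_ms), mean(h_ss)),
  nrow = 2
)
```

For this simple model the Hessian-based estimate reduces to `diag(c(1, 2) / sigma_hat^2)`, while the score-based estimate varies with the sample; the two agree only approximately in finite samples.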

Value

A list with the following two components:

Statistic: the value of the goodness-of-fit statistic.
p-value: the approximate p-value for the goodness-of-fit test based on the empirical distribution function.

If method = 'cvm' or method = 'ad', each component is a single numeric value. If method = 'both', each component is a numeric vector with two elements, one for each statistic.
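For orientation, the two statistics can be computed from values in (0, 1) with the standard computational formulas. The helper names `cvm_stat` and `ad_stat` are hypothetical, and this sketch does not reproduce the package's p-value calculation, which has to account for the estimated parameters.

```r
# Cramer-von Mises statistic for values u in (0, 1).
cvm_stat <- function(u) {
  n <- length(u)
  u <- sort(u)
  i <- seq_len(n)
  1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
}

# Anderson-Darling statistic for values u in (0, 1).
ad_stat <- function(u) {
  n <- length(u)
  u <- sort(u)
  i <- seq_len(n)
  # rev(u)[i] is the (n + 1 - i)-th order statistic
  -n - mean((2 * i - 1) * (log(u) + log(1 - rev(u))))
}

set.seed(7)
u <- runif(30)

# Analogue of method = 'both': one element per statistic
# (the element names are illustrative only).
both <- c(cvm = cvm_stat(u), ad = ad_stat(u))
```

Both statistics grow as the sample departs from uniformity, so large values count as evidence against the hypothesized distribution.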

Examples

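set.seed(123)
# Simulate a linear model with n observations and p covariates
n <- 50
p <- 5
x <- matrix( runif(n*p), nrow = n, ncol = p)
e <- rnorm(n)
b <- runif(p)
# Convert the n x 1 matrix product to a plain numeric vector
y <- as.numeric(x %*% b + e)
# Fit the model internally and test the residuals for normality
testLMNormal(x, y)
# Or pass lm.fit object directly (note x = TRUE and y = TRUE):
lm.fit <- lm(y ~ x, x = TRUE, y = TRUE)
testLMNormal(fit = lm.fit)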
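The effect of `x = TRUE` and `y = TRUE` in `lm` can be checked with base R alone. The sketch below uses hypothetical variable names and shows that the design matrix and response are stored in the fitted object only when requested.

```r
set.seed(123)
n <- 50
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- 1 + dat$x1 - dat$x2 + rnorm(n)

fit_default <- lm(y ~ x1 + x2, data = dat)
fit_keep <- lm(y ~ x1 + x2, data = dat, x = TRUE, y = TRUE)

# Check component names explicitly: `$` does partial matching on lists,
# so fit_default$x could silently pick up another component such as `xlevels`.
has_xy_default <- c("x", "y") %in% names(fit_default)
has_xy_keep <- c("x", "y") %in% names(fit_keep)

# Stored design matrix (intercept plus two covariates) and response
design <- fit_keep[["x"]]
response <- fit_keep[["y"]]
```

Here `has_xy_default` is `FALSE` for both components and `has_xy_keep` is `TRUE` for both, which is why a fitted object passed to `fit` should be created with these two arguments set.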

[Package gofedf version 0.1.0 Index]