Bayesian inference for multiple linear regression

Description

bayes.lm is used to fit linear models in the Bayesian paradigm. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although these are not tested). This documentation is shamelessly adapated from the lm documentation

Usage

bayes.lm(
formula,
data,
subset,
na.action,
model = TRUE,
x = FALSE,
y = FALSE,
center = TRUE,
prior = NULL,
sigma = FALSE
)


Arguments

 formula an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’. data an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which bayes.lm is called. subset an optional vector specifying a subset of observations to be used in the fitting process. na.action a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is link[stats]{na.fail} if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful. model, x, y logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response) are returned. \beta. This argument is ignored for a flat prior. center logical or numeric. If TRUE then the covariates will be centered on their means to make them orthogonal to the intercept. This probably makes no sense for models with factors, and if the argument is numeric then it contains a vector of covariate indices to be centered (not implemented yet). prior A list containing b0 (A vector of prior coefficients) and V0 (A prior covariance matrix) sigma the population standard deviation of the errors. If FALSE then this is estimated from the residual sum of squares from the ML fit.

Details

Models for bayes.lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.

See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula (see aov and demo(glm.vr) for an example).

A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.

bayes.lm calls the lower level function lm.fit to get the maximum likelihood estimates see below, for the actual numerical computations. For programming only, you may consider doing likewise.

subset is evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.

Value

bayes.lm returns an object of class Bolstad. The summary function is used to obtain and print a summary of the results much like the usual summary from a linear regression using lm. The generic accessor functions coef, fitted.values and residuals extract various useful features of the value returned by bayes.lm. Note that the residuals are computed at the posterior mean values of the coefficients.

An object of class "Bolstad" from this function is a list containing at least the following components:

 coefficients a named vector of coefficients which contains the posterior mean post.var a matrix containing the posterior variance-covariance matrix of the coefficients post.sd sigma residuals the residuals, that is response minus fitted values (computed at the posterior mean) fitted.values the fitted mean values (computed at the posterior mean) df.residual the residual degrees of freedom call the matched call terms the terms object used y if requested, the response used x if requested, the model matrix used model if requested (the default), the model frame used na.action (where relevant) information returned by model.frame on the special handling of NAs

Examples

data(bears)
bears = subset(bears, Obs.No==1)
bears = bears[,-c(1,2,3,11,12)]
bears = bears[ ,c(7, 1:6)]
bears$Sex = bears$Sex - 1
log.bears = data.frame(log.Weight = log(bears\$Weight), bears[,2:7])

b0 = rep(0, 7)
V0 = diag(rep(1e6,7))

prior = list(b0 = b0, V0 = V0))
summary(fit)
print(fit)

## Dobson (1990) Page 9: Plant Weight Data:
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.D9 <- lm(weight ~ group)
bayes.D9 <- bayes.lm(weight ~ group)

summary(lm.D9)
summary(bayes.D9)