bayes.lm {Bolstad}

R Documentation

Bayesian inference for multiple linear regression

Description

bayes.lm is used to fit linear models in the Bayesian paradigm. It can be used to carry out regression, single-stratum analysis of variance and analysis of covariance (although these are not tested). This documentation is shamelessly adapted from the lm documentation.

Usage

bayes.lm(
  formula,
  data,
  subset,
  na.action,
  model = TRUE,
  x = FALSE,
  y = FALSE,
  center = TRUE,
  prior = NULL,
  sigma = FALSE
)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which `bayes.lm` is called.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`model`, `x`, `y`	logicals. If `TRUE` the corresponding components of the fit (the model frame, the model matrix, the response) are returned.
`center`	logical or numeric. If `TRUE` then the covariates will be centered on their means to make them orthogonal to the intercept. This probably makes no sense for models with factors. If the argument is numeric, it contains a vector of covariate indices to be centered (not implemented yet).
`prior`	a list containing `b0` (a vector of prior coefficients) and `V0` (a prior covariance matrix)
`sigma`	the population standard deviation of the errors. If `FALSE` then this is estimated from the residual sum of squares from the ML fit.
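
The effect of centering described for `center` can be checked with a short base R sketch. It uses simulated data and the illustrative names `x1`, `x2`, `Xc` and `X` (none of which come from the package), and it shows only the general idea, not how bayes.lm itself performs the centering.

```r
# Simulated covariates; all names here are illustrative only.
set.seed(1)
x1 <- rnorm(20, mean = 5)
x2 <- runif(20)

# Centre each covariate on its mean (no rescaling).
Xc <- scale(cbind(x1, x2), center = TRUE, scale = FALSE)

# Design matrix with an intercept column of ones.
X <- cbind(Intercept = 1, Xc)

# The cross-products between the intercept column and the centred
# covariates are zero up to floating-point error, i.e. they are orthogonal.
XtX <- crossprod(X)
print(round(XtX["Intercept", c("x1", "x2")], 10))
```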

Details

Models for bayes.lm are specified symbolically. A typical model has the form `response ~ terms` where `response` is the (numeric) response vector and `terms` is a series of terms which specifies a linear predictor for `response`. A terms specification of the form `first + second` indicates all the terms in `first` together with all the terms in `second` with duplicates removed. A specification of the form `first:second` indicates the set of terms obtained by taking the interactions of all terms in `first` with all terms in `second`. The specification `first*second` indicates the cross of `first` and `second`. This is the same as `first + second + first:second`.
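
The formula operators described above can be inspected directly with `terms` from base R. This is a small sketch; `term_labels` is a hypothetical helper defined here for convenience, and `y`, `a` and `b` are placeholder variable names.

```r
# Hypothetical helper: list the terms a formula expands to.
term_labels <- function(f) attr(terms(f), "term.labels")

term_labels(y ~ a + b + a)  # duplicates are removed
term_labels(y ~ a:b)        # interaction only
term_labels(y ~ a * b)      # cross: main effects plus interaction

# a * b expands to the same terms as a + b + a:b
identical(term_labels(y ~ a * b), term_labels(y ~ a + b + a:b))
```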

See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on. To avoid this, pass a terms object as the formula (see aov and demo(glm.vr) for an example).
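
The re-ordering of terms can be seen with `terms` in base R. The sketch below uses placeholder variables `y`, `a` and `b`; building the terms object with `keep.order = TRUE` is one way to obtain a terms object that preserves the written order.

```r
f <- y ~ a:b + a + b

# Default behaviour: main effects first, then interactions.
default_order <- attr(terms(f), "term.labels")
print(default_order)

# A terms object built with keep.order = TRUE keeps the order as written.
kept_order <- attr(terms(f, keep.order = TRUE), "term.labels")
print(kept_order)
```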

A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.
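
Whether a formula carries an intercept can be checked through the `"intercept"` attribute of its terms object. A brief base R sketch with placeholder variables `y` and `x`:

```r
# 1 means an intercept is included, 0 means it has been removed.
with_int  <- attr(terms(y ~ x), "intercept")
minus_one <- attr(terms(y ~ x - 1), "intercept")
zero_plus <- attr(terms(y ~ 0 + x), "intercept")

print(c(with_int, minus_one, zero_plus))
```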

bayes.lm calls the lower-level function lm.fit, which performs the actual numerical computations, to obtain the maximum likelihood estimates. For programming only, you may consider doing likewise.
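
As a rough illustration of this workflow, the sketch below obtains maximum likelihood estimates from `lm.fit` and combines them with a normal prior using the textbook conjugate update for a known error variance. It is an assumption that bayes.lm proceeds along these lines; the source does not spell out the computation. All names (`X`, `y`, `b.ml`, `V1`, `b1`, and so on) and the simulated data are illustrative only.

```r
# Simulated data; every name below is illustrative.
set.seed(42)
n <- 50
x <- rnorm(n)
X <- cbind(Intercept = 1, x = x)
y <- drop(X %*% c(2, 0.5)) + rnorm(n, sd = 0.3)

# Maximum likelihood (least squares) fit via the low-level fitter.
ml <- lm.fit(X, y)
b.ml <- ml$coefficients
# One common estimate of the error variance from the residual sum of squares.
sigma2 <- sum(ml$residuals^2) / ml$df.residual

# Vague normal prior on the coefficients.
b0 <- rep(0, ncol(X))
V0 <- diag(1e6, ncol(X))

# Conjugate update, treating the error variance as known:
# posterior precision = prior precision + data precision.
prior.prec <- solve(V0)
data.prec <- crossprod(X) / sigma2
V1 <- solve(prior.prec + data.prec)
b1 <- drop(V1 %*% (prior.prec %*% b0 + data.prec %*% b.ml))

print(cbind(ML = b.ml, posterior = b1))
```

With a prior this vague the posterior mean is essentially the maximum likelihood estimate; a tighter `V0` would pull it toward `b0`.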

subset is evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.

Value

bayes.lm returns an object of class Bolstad. The summary function is used to obtain and print a summary of the results much like the usual summary from a linear regression using lm. The generic accessor functions coef, fitted.values and residuals extract various useful features of the value returned by bayes.lm. Note that the residuals are computed at the posterior mean values of the coefficients.

An object of class "Bolstad" from this function is a list containing at least the following components:

`coefficients`	a named vector of coefficients which contains the posterior mean
`post.var`	a matrix containing the posterior variance-covariance matrix of the coefficients
`post.sd`	sigma, the standard deviation of the errors
`residuals`	the residuals, that is response minus fitted values (computed at the posterior mean)
`fitted.values`	the fitted mean values (computed at the posterior mean)
`df.residual`	the residual degrees of freedom
`call`	the matched call
`terms`	the `terms` object used
`y`	if requested, the response used
`x`	if requested, the model matrix used
`model`	if requested (the default), the model frame used
`na.action`	(where relevant) information returned by `model.frame` on the special handling of `NA`s
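
The `coefficients` and `post.var` components can be combined to summarise the posterior. The sketch below uses a stand-in list with made-up numbers rather than a real fit, and it assumes an approximately normal posterior for each coefficient; neither the numbers nor the interval method come from the package.

```r
# Stand-in object with the documented component names.
# The values are invented purely for illustration.
fit <- list(
  coefficients = c("(Intercept)" = 5.0, groupTrt = -0.4),
  post.var = matrix(c(0.05, -0.05, -0.05, 0.10), nrow = 2)
)

post.mean <- fit$coefficients
post.se <- sqrt(diag(fit$post.var))  # posterior standard deviations

# Approximate 95% credible intervals under a normal posterior (an assumption).
z <- qnorm(0.975)
ci <- cbind(lower = post.mean - z * post.se,
            mean  = post.mean,
            upper = post.mean + z * post.se)
print(ci)
```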

Examples

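## Load the bears data and keep the rows with Obs.No equal to 1
data(bears)
bears = subset(bears, Obs.No==1)
## Drop the columns not used in the model, then move the seventh
## remaining column (Weight) to the front
bears = bears[,-c(1,2,3,11,12)]
bears = bears[ ,c(7, 1:6)]
## Shift the Sex code down by one
bears$Sex = bears$Sex - 1
## Data frame with the response already log-transformed
## (not used in the fit below, which applies log() in the formula)
log.bears = data.frame(log.Weight = log(bears$Weight), bears[,2:7])

## Vague prior: zero prior means and a large prior variance for the
## intercept and the six covariates
b0 = rep(0, 7)
V0 = diag(rep(1e6,7))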

fit = bayes.lm(log(Weight)~Sex+Head.L+Head.W+Neck.G+Length+Chest.G, data = bears,
               prior = list(b0 = b0, V0 = V0))
summary(fit)
print(fit)


## Dobson (1990) Page 9: Plant Weight Data:
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.D9 <- lm(weight ~ group)
bayes.D9 <- bayes.lm(weight ~ group)

summary(lm.D9)
summary(bayes.D9)

[Package Bolstad version 0.2-41 Index]