R: Adjust prediction intervals for multiple comparisons

predict_adjust {api2lm}

R Documentation

Adjust prediction intervals for multiple comparisons

Description

Produces adjusted confidence intervals for mean responses, or adjusted prediction intervals for new responses, from lm objects. The intervals have a family-wise confidence level of at least level; this guarantee does not apply when method = "none", since no adjustment is made. Internally, the function is a slight revision of the code used in the predict.lm function.

Usage

predict_adjust(
  object,
  newdata,
  se.fit = FALSE,
  scale = NULL,
  df = Inf,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  type = c("response", "terms"),
  method = "none",
  terms = NULL,
  na.action = stats::na.pass,
  pred.var = res.var/weights,
  weights = 1,
  ...
)

Arguments

`object`	Object of class inheriting from `"lm"`
`newdata`	An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
`se.fit`	A switch indicating if standard errors are required.
`scale`	Scale parameter for the standard error calculation.
`df`	Degrees of freedom for scale.
`interval`	Type of interval calculation. Can be abbreviated.
`level`	Tolerance/confidence level.
`type`	Type of prediction (response or model term). Can be abbreviated.
`method`	A character string indicating the type of adjustment to make. The default choice is `"none"`. The other available options are `"bonferroni"`, `"wh"` (Working-Hotelling), and `"scheffe"`.
`terms`	If `type = "terms"`, which terms (default is all terms), a `character` vector.
`na.action`	Function determining what should be done with missing values in `newdata`. The default is to predict `NA`.
`pred.var`	The variance(s) for future observations to be assumed for prediction intervals. See ‘Details’.
`weights`	Variance weights for prediction. This can be a numeric vector or a one-sided model formula. In the latter case, it is interpreted as an expression evaluated in `newdata`.
`...`	Further arguments passed to or from other methods.

Details

Let a = 1 - level. All intervals are computed using the formula prediction +/- m * epesd, where m is a multiplier and epesd is the estimated standard deviation of the prediction error of the estimate.
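As a minimal sketch of the formula above, the unadjusted intervals can be rebuilt by hand from the output of stats::predict. This assumes the default weights = 1, so that the prediction variance equals the residual variance; the object names (p, epesd_ci, epesd_pi, conf_int, pred_int) are illustrative and not part of the package.

```r
# Model and new data taken from the Examples section.
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)
newdata <- as.data.frame(rbind(
  apply(mtcars, 2, mean),
  apply(mtcars, 2, median)))

level <- 0.95
a <- 1 - level

# Predictions plus the standard errors of the estimated mean responses.
p <- stats::predict(fit, newdata = newdata, se.fit = TRUE)

# epesd for a confidence interval: standard error of the estimated mean.
epesd_ci <- p$se.fit
# epesd for a prediction interval: also includes the residual variance
# (assumes weights = 1, so pred.var equals the residual variance).
epesd_pi <- sqrt(p$se.fit^2 + p$residual.scale^2)

# Unadjusted multiplier (method = "none").
m <- stats::qt(1 - a/2, df = fit$df.residual)

# prediction +/- m * epesd
conf_int <- cbind(fit = p$fit,
                  lwr = p$fit - m * epesd_ci,
                  upr = p$fit + m * epesd_ci)
pred_int <- cbind(fit = p$fit,
                  lwr = p$fit - m * epesd_pi,
                  upr = p$fit + m * epesd_pi)
```

Swapping m for one of the adjusted multipliers described below widens both sets of intervals without changing the fit column.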

method = "none" (no correction) produces the standard unadjusted t-based intervals that use the multiplier m = stats::qt(1 - a/2, df = object$df.residual).

method = "bonferroni" produces Bonferroni-adjusted intervals that use the multiplier m = stats::qt(1 - a/(2 * k), df = object$df.residual), where k is the number of intervals being produced.
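The two t-based multipliers above can be computed directly, which makes the effect of the Bonferroni adjustment easy to see. The sketch below assumes k = 2 intervals, matching the two rows of newdata in the Examples section; m_none and m_bonf are illustrative names.

```r
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)

level <- 0.95
a <- 1 - level
k <- 2  # number of intervals being produced (assumed for illustration)

# Unadjusted multiplier: each interval individually has confidence level `level`.
m_none <- stats::qt(1 - a/2, df = fit$df.residual)

# Bonferroni multiplier: the error rate a is split evenly across the k intervals.
m_bonf <- stats::qt(1 - a/(2 * k), df = fit$df.residual)

c(none = m_none, bonferroni = m_bonf)
```

With k = 1 the two multipliers coincide, and the Bonferroni multiplier grows as k increases.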

The Working-Hotelling and Scheffe adjustments are distinct: the Working-Hotelling adjustment is typically associated with multiple confidence intervals for the mean response, while the Scheffe adjustment is typically associated with multiple prediction intervals for new responses. However, references often use these names interchangeably, so this function treats method = "wh" and method = "scheffe" as equivalent and chooses the adjustment based on interval.

Either method = "wh" (Working-Hotelling) or method = "scheffe", combined with interval = "confidence", produces Working-Hotelling-adjusted intervals that use the multiplier m = sqrt(p * stats::qf(level, df1 = p, df2 = object$df.residual)), where p is the number of estimated coefficients in the model.

Either method = "wh" (Working-Hotelling) or method = "scheffe", combined with interval = "prediction", produces Scheffe-adjusted intervals that use the multiplier m = sqrt(k * stats::qf(level, df1 = k, df2 = object$df.residual)), where k is the number of intervals being produced.
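The F-based multipliers can be sketched in the same way. The code below assumes a full-rank model, so that p can be taken as length(stats::coef(fit)), and k = 2 intervals; the names m_wh and m_scheffe are illustrative.

```r
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)

level <- 0.95
a <- 1 - level

# Number of estimated coefficients (assumes no aliased coefficients).
p <- length(stats::coef(fit))
# Number of intervals being produced (assumed for illustration).
k <- 2

# Working-Hotelling multiplier, used when interval = "confidence".
m_wh <- sqrt(p * stats::qf(level, df1 = p, df2 = fit$df.residual))

# Scheffe multiplier, used when interval = "prediction".
m_scheffe <- sqrt(k * stats::qf(level, df1 = k, df2 = fit$df.residual))

# Unadjusted t multiplier for comparison.
m_none <- stats::qt(1 - a/2, df = fit$df.residual)

c(none = m_none, wh = m_wh, scheffe = m_scheffe)
```

The Working-Hotelling multiplier depends only on the model (through p), not on how many intervals are requested, whereas the Scheffe multiplier for prediction intervals grows with k.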

Value

predict_adjust produces:

A vector of predictions if interval = "none".

A matrix of predictions and bounds with column names fit, lwr, and upr if interval is "confidence" or "prediction". For type = "terms" this is a matrix with a column per term and may have an attribute "constant".

If se.fit is TRUE, a list with the following components is returned:

fit: vector or matrix as above
se.fit: standard error of predicted means
residual.scale: residual standard deviations
df: degrees of freedom for residual
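Because the function is described as a slight revision of predict.lm, the shape of the return value can be illustrated with stats::predict as a stand-in. This is an assumption made for illustration; the sketch shows the unadjusted predict.lm output, not output from predict_adjust itself.

```r
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)
newdata <- as.data.frame(rbind(
  apply(mtcars, 2, mean),
  apply(mtcars, 2, median)))

# interval = "none": a plain vector of predictions.
v <- stats::predict(fit, newdata = newdata)

# interval set: a matrix with columns fit, lwr, and upr.
mat <- stats::predict(fit, newdata = newdata, interval = "confidence")

# se.fit = TRUE: a list whose fit component is the vector or matrix above.
out <- stats::predict(fit, newdata = newdata,
                      interval = "confidence", se.fit = TRUE)
names(out)
colnames(out$fit)
```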

References

Bonferroni, C. (1936). Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 3-62.

Working, H., & Hotelling, H. (1929). Applications of the theory of error to the interpretation of trends. Journal of the American Statistical Association, 24(165A), 73-85. doi:10.1080/01621459.1929.10506274

Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2004). Applied Linear Statistical Models, 5th edition. New York: McGraw-Hill/Irwin.

Examples

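library(api2lm)

# Model fuel consumption (gallons per 100 miles) on four predictors.
fit <- lm(100/mpg ~ disp + hp + wt + am, data = mtcars)
# Two new observations: the column means and the column medians of mtcars.
newdata <- as.data.frame(rbind(
               apply(mtcars, 2, mean),
               apply(mtcars, 2, median)))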
predict_adjust(fit, newdata = newdata,
               interval = "confidence",
               method = "none")
predict_adjust(fit, newdata = newdata,
               interval = "confidence",
               method = "bonferroni")
predict_adjust(fit, newdata = newdata,
               interval = "confidence",
               method = "wh")
predict_adjust(fit, newdata = newdata,
               interval = "prediction",
               method = "scheffe")

[Package api2lm version 0.2 Index]