add_prediction {DImodelsVis}R Documentation

Add predictions and confidence interval to data using either a model object or model coefficients

Description

Add predictions and confidence interval to data using either a model object or model coefficients

Usage

add_prediction(
  data,
  model = NULL,
  coefficients = NULL,
  coeff_cols = NULL,
  vcov = NULL,
  interval = c("none", "confidence", "prediction"),
  conf.level = 0.95
)

Arguments

data

A data-frame containing appropriate values for all the terms in the model.

model

A regression model object which will be used to make predictions for the observations in 'data'. Will override 'coefficients' if specified.

coefficients

If a regression model is not available (or can't be fit in R), the regression coefficients from a model fit in some other language can be used to calculate predictions. However, the user would have to ensure there's an appropriate one-to-one positional mapping between the data columns and the coefficient values. Further, they would also have to provide a variance-covariance matrix of the coefficients in the 'vcov' parameter if they want the associated CI for the prediction or it would not be possible to calculate confidence/prediction intervals using this method.

coeff_cols

If 'coefficients' are specified and a one-to-one positional mapping between the data-columns and coefficient vector is not present. A character string or numeric index can be specified here to reorder the data columns and match the corresponding coefficient value to the respective data column. See the "Use model coefficients for prediction" section in examples.

vcov

If regression coefficients are specified, then the variance-covariance matrix of the coefficients can be specified here to calculate the associated confidence interval around each prediction. Failure to do so would result in no confidence intervals being returned. Ensure 'coefficients' and 'vcov' have the same positional mapping with the data.

interval

Type of interval to calculate:

"none" (default)

No interval to be calculated.

"confidence"

Calculate a confidence interval.

"prediction"

Calculate a prediction interval.

conf.level

The confidence level for calculating confidence/prediction intervals. Default is 0.95.

Value

A data-frame with the following additional columns

.Pred

The predicted response for each observation.

.Lower

The lower limit of the confidence/prediction interval for each observation (will be same as ".Pred" if using 'coefficients' and 'vcov' is not specified).

.Upper

The lower limit of the confidence/prediction interval for each observation (will be same as ".Pred" if using 'coefficients' and 'vcov' is not specified).

Examples

library(DImodels)
data(sim1)

# Fit a model
mod <- lm(response ~ 0 + p1 + p2 + p3 + p4 + p1:p2 + p3:p4, data = sim1)

# Create new data for adding predictions
newdata <- head(sim1[sim1$block == 1,])
print(newdata)

# Add predictions to data
add_prediction(data = newdata, model = mod)

# Adding predictions to data with confidence interval
add_prediction(data = newdata, model = mod, interval = "confidence")

# Calculate prediction intervals instead
add_prediction(data = newdata, model = mod, interval = "prediction")

# Default is a 95% interval, change to 99%
add_prediction(data = newdata, model = mod, interval = "prediction",
               conf.level = 0.99)

####################################################################
##### Use model coefficients for prediction
coeffs <- mod$coefficients

# Would now have to add columns corresponding to each coefficient in the
# data and ensure there is an appropriate mapping between data columns and
# the coefficients.
newdata$`p1:p2` = newdata$p1 * newdata$p2
newdata$`p3:p4` = newdata$p3 * newdata$p4

# If the coefficients are named then the function will try to
# perform matching between data columns and the coefficients
# Notice that confidence intervals are not produced if we don't
# specify a variance covariance matrix
add_prediction(data = newdata, coefficients = coeffs)

# However, if the coefficients are not named
# The user would have to manually specify the subset
# of data columns arranged according to the coefficients
coeffs <- unname(coeffs)

subset_data <- newdata[, c(3:6, 8,9)]
subset_data # Notice now we have the exact columns in data as in coefficients
add_prediction(data = subset_data, coefficients = coeffs)

# Or specify a selection (either by name or index) in coeff_cols
add_prediction(data = newdata, coefficients = coeffs,
               coeff_cols = c("p1", "p2", "p3", "p4", "p1:p2", "p3:p4"))

add_prediction(data = newdata, coefficients = coeffs,
               coeff_cols = c(3, 4, 5, 6, 8, 9))

# Adding confidence intervals when using model coefficients
coeffs <- mod$coefficients
# We need to provide a variance-covariance matrix to calculate the CI
# when using `coefficients` argument. The following warning will be given
add_prediction(data = newdata, coefficients = coeffs,
               interval = "confidence")

vcov_mat <- vcov(mod)
add_prediction(data = newdata, coefficients = coeffs,
               interval = "confidence", vcov = vcov_mat)

# Currently both confidence and prediction intervals will be the same when
# using this method
add_prediction(data = newdata, coefficients = coeffs,
               interval = "prediction", vcov = vcov_mat)

[Package DImodelsVis version 1.0.1 Index]