R: Prediction intervals for future observations based on linear...

lmer_pi_futmat {predint}

R Documentation

Prediction intervals for future observations based on linear random effects models

Description

lmer_pi_futmat() calculates a bootstrap calibrated prediction interval for one or more future observation(s) based on linear random effects models. With this approach, the experimental design of the future data is taken into account (see below).

Usage

lmer_pi_futmat(
  model,
  newdat = NULL,
  futmat_list = NULL,
  alternative = "both",
  alpha = 0.05,
  nboot = 10000,
  delta_min = 0.01,
  delta_max = 10,
  tolerance = 0.001,
  traceplot = TRUE,
  n_bisec = 30,
  algorithm = "MS22"
)

Arguments

`model`	a random effects model of class `"lmerMod"`
`newdat`	either 1 or a `data.frame` with the same column names as the historical data on which `model` depends
`futmat_list`	a list that contains design matrices for each random factor
`alternative`	either "both", "upper" or "lower". `alternative` specifies if a prediction interval or an upper or a lower prediction limit should be computed
`alpha`	defines the level of confidence (1-`alpha`)
`nboot`	number of bootstraps
`delta_min`	lower start value for bisection
`delta_max`	upper start value for bisection
`tolerance`	tolerance for the coverage probability in the bisection
`traceplot`	if `TRUE`: Plot for visualization of the bisection process
`n_bisec`	maximal number of bisection steps
`algorithm`	either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.

If algorithm is set to "MS22", both limits of the prediction interval are calibrated simultaneously using the algorithm described in Menssen and Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given as

[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}

with \hat{\mu} as the expected future observation (historical mean) and \hat{\sigma}^2_c as the c=1, 2, ..., C variance components and \hat{\sigma}^2_{C+1} as the residual variance obtained from the random effects model fitted with lme4::lmer() and q^{calib} as the as the bootstrap-calibrated coefficient used for interval calculation.

If algorithm is set to "MS22mod", both limits of the prediction interval are calibrated independently from each other. The resulting prediction interval is given by

[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad \hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].

Please note, that this modification does not affect the calibration procedure, if only prediction limits are of interest.

If newdat is defined, the bootstrapped future observations used for the calibration process mimic the structure of the data set provided via newdat. The data sampling is based on a list of design matrices (one for each random factor) that can be obtained if newdat and the model formula are provided to lme4::lFormula(). Hence, each random factor that is part of the initial model must have at least two replicates in newdat.
If a random factor in the future data set does not have any replicate, a list that contains design matrices (one for each random factor) can be provided via futmat_list.

This function is an implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4, except, that the bootstrap calibration values are drawn from bootstrap samples that mimic the future data as described above.

Value

lmer_pi_futmat() returns an object of class c("predint", "normalPI") with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260

Examples


# loading lme4
library(lme4)

# Fitting a random effects model based on c2_dat1
fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)
summary(fit)

#----------------------------------------------------------------------------
### Using newdat

# Prediction interval using c2_dat2 as future data
pred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100)
summary(pred_int)

# Upper prediction limit for m=1 future observations
pred_u <- lmer_pi_futmat(model=fit, newdat=1, alternative="upper", nboot=100)
summary(pred_u)

#----------------------------------------------------------------------------

### Using futmat_list

# c2_dat4 has no replication for b. Hence the list of design matrices can not be
# generated by lme4::lFormula() and have to be provided by hand via futmat_list.

c2_dat4

# Build a list containing the design matrices

fml <- vector(length=4, "list")

names(fml) <- c("a:b", "b", "a", "Residual")

fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))

fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1))

fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))

fml[["Residual"]] <- diag(6)

fml

# Please note, that the design matrix for the interaction term a:b is also
# provided even there is no replication for b, since it is assumed that
# both, the historical and the future data descent from the same data generating
# process.

# Calculate the PI
pred_fml <- lmer_pi_futmat(model=fit, futmat_list=fml, alternative="both", nboot=100)
summary(pred_fml)

#----------------------------------------------------------------------------

# Please note that nboot was set to 100 in order to decrease computing time
# of the example. For a valid analysis set nboot=10000.

[Package predint version 2.2.1 Index]