stackingWeights {MuMIn}    R Documentation
Stacking model weights
Description
Compute model weights based on a cross-validation-like procedure.
Usage
stackingWeights(object, ..., data, R, p = 0.5)
Arguments
object, ...: two or more fitted model objects (e.g. glm fits), or a list of such models.
data: a data frame containing the variables in the model, used for fitting and prediction.
R: the number of replicates.
p: the proportion of the data to be used as the training set.
Details
Each model in a set is fitted to the training data: a subset of p * N observations in data. From these models, predictions are produced for the remaining part of data (the test, or hold-out, data). These hold-out predictions are then fitted to the hold-out observations by optimising the weights with which the models are combined. This process is repeated R times, yielding a distribution of weights for each model (which Smyth & Wolpert (1998) referred to as an ‘empirical Bayesian estimate of posterior model probability’). A mean or median of the model weights is taken for each model and re-scaled to sum to one.
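For illustration, a single replicate of this procedure could be sketched in plain R as below. This is only a conceptual sketch, not the package's internal implementation; the helper name oneStackingReplicate, its response argument, and the unconstrained least-squares weight fit are assumptions made for illustration.

oneStackingReplicate <- function(models, data, response, p = 0.5) {
    n <- nrow(data)
    train <- sample.int(n, floor(p * n))
    test <- setdiff(seq_len(n), train)
    # refit each candidate model to the training subset
    fits <- lapply(models, function(m)
        glm(formula(m), family = family(m), data = data[train, , drop = FALSE]))
    # hold-out predictions, one column per model
    preds <- sapply(fits, predict, newdata = data[test, , drop = FALSE],
                    type = "response")
    # fit the hold-out predictions to the hold-out observations;
    # an unconstrained least-squares fit without intercept stands in here
    # for the weight optimiser
    y.holdout <- data[[response]][test]
    w <- coef(lm(y.holdout ~ preds - 1))
    w / sum(w)  # re-scale to sum to one
}

# repeated R times and summarised by mean or median, for example:
#   W <- replicate(10, oneStackingReplicate(models, dat, "y"))
#   rowMeans(W); apply(W, 1, median)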
Value
A matrix with two rows ("mean" and "median"), containing model weights calculated using the mean and the median of the weights across the R replicates.
Note
This approach requires a sample size of at least twice the number of models.
Author(s)
Carsten Dormann, Kamil Bartoń
References
Wolpert, D. H. 1992 Stacked generalization. Neural Networks 5, 241–259.
Smyth, P. and Wolpert, D. 1998 An Evaluation of Linearly Combining Density Estimators via Stacking. Technical Report No. 98–25. Information and Computer Science Department, University of California, Irvine, CA.
Dormann, C. et al. 2018 Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. Ecological Monographs 88, 485–504.
See Also
Other model weights: BGWeights(), bootWeights(), cos2Weights(), jackknifeWeights()
Examples
# simulated Cement dataset to increase the sample size for the training data
fm0 <- glm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
dat <- as.data.frame(apply(Cement[, -1], 2, sample, 50, replace = TRUE))
dat$y <- rnorm(nrow(dat), predict(fm0), sigma(fm0))

# global model fitted to the training data:
fm <- glm(y ~ X1 + X2 + X3 + X4, data = dat, na.action = na.fail)

# generate a list of *some* subsets of the global model
models <- lapply(dredge(fm, evaluate = FALSE, fixed = "X1", m.lim = c(1, 3)), eval)

# stacking weights from R = 10 training/hold-out splits
wts <- stackingWeights(models, data = dat, R = 10)

# model-averaged predictions using the mean-based weights
ma <- model.avg(models)
Weights(ma) <- wts["mean", ]
predict(ma)
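
# The second row of the returned matrix holds the median-based weights;
# they can be substituted in the same way (a usage sketch continuing the
# example above):
Weights(ma) <- wts["median", ]
predict(ma)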