R: Calibrate an ensemble Bayesian Model Averaging model

calibrateEnsemble {EBMAforecast}

R Documentation

Calibrate an ensemble Bayesian Model Averaging model

Description

This function calibrates an EBMA model based on out-of-sample performance in the calibration period. Given a dependent variable and calibration-sample predictions from multiple component forecast models in the ForecastData object, calibrateEnsemble fits an ensemble BMA mixture model. The weight assigned to each model is derived from that model's performance in the calibration period. Missing observations are allowed in the calibration period; however, models with missing observations are penalized. When missing observations are prevalent in the calibration set, the EM algorithm or Gibbs sampler is adjusted and model parameters are estimated by maximizing a renormalized partial expected complete-data log-likelihood (Fraley et al. 2010).
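The weighting idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes binary outcomes, no missing predictions, and no logit transformation of the component forecasts, and the names `em_weights`, `preds`, and `y` are hypothetical.

```r
# Simplified EM for the weights of a mixture of binary forecasts.
# preds: n x K matrix of predicted probabilities; y: vector of 0/1 outcomes.
# Assumption: component predictions are used as-is (no model parameters).
em_weights <- function(preds, y, tol = sqrt(.Machine$double.eps), max_iter = 1e4) {
  K <- ncol(preds)
  w <- rep(1 / K, K)                      # equal starting weights
  # Likelihood of each observed outcome under each component model
  lik <- y * preds + (1 - y) * (1 - preds)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: responsibility of model k for observation i
    num <- sweep(lik, 2, w, `*`)
    z <- num / rowSums(num)
    # M-step: new weight is the average responsibility
    w <- colMeans(z)
    loglik <- sum(log(lik %*% w))
    # Stop once the log-likelihood improves by less than tol
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(weights = w, logLik = loglik, iter = iter)
}

# Simulated data: three hypothetical component models of varying quality
set.seed(1)
n <- 500
truth <- runif(n)
y <- rbinom(n, 1, truth)
clip <- function(p) pmin(pmax(p, 0.01), 0.99)
preds <- cbind(
  good  = clip(truth + rnorm(n, sd = 0.05)),
  noisy = clip(truth + rnorm(n, sd = 0.3)),
  flat  = rep(0.5, n)
)
fit <- em_weights(preds, y)
round(fit$weights, 3)
```

The weights sum to one by construction, because each row of responsibilities sums to one. Handling missing predictions would require renormalizing over the models available for each observation, which this sketch omits.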

Usage

calibrateEnsemble(
  .forecastData = new("ForecastData"),
  exp = 1,
  tol = sqrt(.Machine$double.eps),
  maxIter = 1e+06,
  model = "logit",
  method = "EM",
  predType = "posteriorMedian",
  useModelParams = TRUE,
  W = rep(1/dim(.forecastData@predCalibration)[2], dim(.forecastData@predCalibration)[2]),
  const = 0,
  modelPriors = rep(1, dim(.forecastData@predCalibration)[2]),
  iterations = 40000,
  burns = 20000,
  thinning = 20,
  ...
)

## S4 method for signature 'ForecastData'
calibrateEnsemble(
  .forecastData = new("ForecastData"),
  exp = 1,
  tol = sqrt(.Machine$double.eps),
  maxIter = 1e+06,
  model = "logit",
  method = "EM",
  predType = "posteriorMean",
  useModelParams = TRUE,
  W = rep(1/dim(.forecastData@predCalibration)[2], dim(.forecastData@predCalibration)[2]),
  const = 0,
  modelPriors = rep(1, dim(.forecastData@predCalibration)[2]),
  iterations = 40000,
  burns = 20000,
  thinning = 20,
  ...
)

Arguments

`.forecastData`	An object of class 'ForecastData' that will be used to calibrate the model.
`exp`	The exponential shrinkage term. Forecasts are raised to the (1/exp) power on the logit scale for the purposes of bias reduction. The default value is `exp=1`.
`tol`	Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization. The default is `tol=sqrt(.Machine$double.eps)`. A looser tolerance such as `tol=0.01` stops the algorithm sooner but is somewhat high for final model estimation.
`maxIter`	The maximum number of iterations the EM algorithm will run before stopping automatically. The default is `maxIter=1e+06`.
`model`	The model type to use, chosen to match the type of data being predicted (e.g., `normal` for continuous outcomes, `logit` for binary outcomes).
`method`	The estimation method used, either `EM` or `gibbs`.
`predType`	The prediction type used for the EBMA model under Gibbs sampling; either `posteriorMedian` or `posteriorMean` (default). Model performance statistics are based on the chosen posterior median or mean forecast. Note that the posterior median forecast is not equal to the forecast based on the median posterior weights. Predictions from EM estimation are based on the mean.
`useModelParams`	If `TRUE`, individual model predictions are transformed based on logit models. If `FALSE`, all models' parameters are set to 0 and 1.
`W`	A vector or matrix of initial model weights. If unspecified, each model receives a weight equal to 1/(number of models).
`const`	User-provided "wisdom of crowds" parameter that serves as the minimum model weight for all models. Default = 0. Only used in models estimated using EM.
`modelPriors`	User-provided vector of Dirichlet prior parameters, one per model. Only used in the normal model estimated with Gibbs sampling. The default prior is 1 for each model.
`iterations`	The number of iterations for the Bayesian model. Default = 40000.
`burns`	The burn in for the Gibbs sampler. Default = 20000.
`thinning`	How much the Gibbs sampler is thinned. Default = 20.
`...`	Not implemented
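Two of the arguments above lend themselves to a small base R illustration. The sketch below first counts how many draws the default `iterations`, `burns`, and `thinning` values would retain, assuming the burn-in is discarded first and every `thinning`-th draw is then kept (how the package indexes the chain is an assumption here). It then shows why the posterior median forecast differs from a forecast built from median weights. The weight draws are simulated, not output from the package, and all variable names are hypothetical.

```r
# Defaults from the usage section
iterations <- 40000
burns <- 20000
thinning <- 20

# Assumption: drop the burn-in, then keep every `thinning`-th draw
keep <- seq(burns + 1, iterations, by = thinning)
n_draws <- length(keep)
n_draws

# Simulated posterior weight draws for three models
# (Dirichlet draws built from normalized gamma variates)
set.seed(42)
K <- 3
g <- matrix(rgamma(n_draws * K, shape = c(4, 2, 1)), ncol = K, byrow = TRUE)
w_draws <- g / rowSums(g)

# Component predictions for a single observation
preds_one <- c(0.9, 0.2, 0.5)

# One EBMA forecast per saved draw
ebma_draws <- as.vector(w_draws %*% preds_one)

# predType = "posteriorMedian" / "posteriorMean" summarize these draws
median_forecast <- median(ebma_draws)
mean_forecast <- mean(ebma_draws)

# Forecast built from the median of each weight: generally a different number,
# since column-wise medians of the weights need not sum to one
median_w <- apply(w_draws, 2, median)
forecast_from_median_w <- sum(median_w * preds_one)

c(median_forecast = median_forecast,
  mean_forecast = mean_forecast,
  forecast_from_median_w = forecast_from_median_w)
```

The mean forecast does coincide with the forecast based on the mean weights, because the ensemble prediction is linear in the weights; the median has no such property.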

Value

Returns an object of class 'FDatFitLogit' or 'FDatFitNormal', a subclass of 'ForecastData', with the following slots:

`predCalibration`	A matrix containing the predictions of all component models and the EBMA model for all observations in the calibration period. Under gibbs sampling, the EBMA prediction is either the median or mean of the posterior predictive distribution, depending on the `predType` setting.
`predTest`	A matrix containing the predictions of all component models and the EBMA model for all observations in the test period. Under gibbs sampling, the EBMA prediction is either the median or mean of the posterior predictive distribution, depending on the `predType` setting.
`outcomeCalibration`	A vector containing the true values of the dependent variable for all observations in the calibration period.
`outcomeTest`	An optional vector containing the true values of the dependent variable for all observations in the test period.
`modelNames`	A character vector containing the names of all component models. If no model names are specified, names will be assigned automatically.
`modelWeights`	A vector containing model weights assigned to each model. When the gibbs sampler is used, this slot contains either the median or mean of the posterior weights, depending on the `predType` setting.
`modelParams`	The parameters for the individual logit models that transform the component models.
`useModelParams`	Indicator whether model parameters for transformation were estimated or not.
`logLik`	The final log-likelihood for the calibrated EBMA model. Empty for estimations using the gibbs sampler.
`exp`	The exponential shrinkage term.
`tol`	Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization.
`maxIter`	The maximum number of iterations the EM algorithm will run before stopping automatically.
`method`	The estimation method used.
`iter`	Number of iterations run in the EM algorithm. Empty for estimations using the gibbs sampler.
`call`	The actual call used to create the object.
`posteriorWeights`	A matrix of the full posterior model weights from model calibration. Rows are the observations in the calibration period, columns are the saved iterations of the gibbs sampler. Empty for EM estimations.
`posteriorPredCalibration`	A matrix of the posterior predictive distribution for observations in the calibration period, based on the full posterior of model weights. Empty for EM estimations.
`posteriorPredTest`	A matrix of the posterior predictive distribution for observations in the test period, based on the full posterior of model weights. Empty for EM estimations.
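Because the return value is an S4 object, the slots listed above are read with the `@` operator. The following sketch assumes the EBMAforecast package and its bundled `calibrationSample` and `testSample` data are installed; the object names `fd` and `fit` are illustrative, and the whole block is skipped when the package is unavailable.

```r
if (requireNamespace("EBMAforecast", quietly = TRUE)) {
  library(EBMAforecast)
  data(calibrationSample)
  data(testSample)

  models <- c("LMER", "SAE", "GLM")
  fd <- makeForecastData(
    .predCalibration = calibrationSample[, models],
    .outcomeCalibration = calibrationSample[, "Insurgency"],
    .predTest = testSample[, models],
    .outcomeTest = testSample[, "Insurgency"],
    .modelNames = models
  )

  # EM calibration; the Gibbs-only slots stay empty in this case
  fit <- calibrateEnsemble(fd, model = "logit", tol = 0.001)

  fit@modelWeights          # weight assigned to each component model
  fit@logLik                # final log-likelihood (EM only)
  fit@iter                  # number of EM iterations run
  dim(fit@predCalibration)  # component and EBMA predictions, calibration period
}
```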

Author(s)

Michael D. Ward <michael.d.ward@duke.edu> and Jacob M. Montgomery <jacob.montgomery@wustl.edu> and Florian M. Hollenbach <florian.hollenbach@tamu.edu>

References

Montgomery, Jacob M., Florian M. Hollenbach and Michael D. Ward. (2012). Improving Predictions Using Ensemble Bayesian Model Averaging. Political Analysis. 20: 271-291.

Raftery, A. E., T. Gneiting, F. Balabdaoui and M. Polakowski. (2005). Using Bayesian Model Averaging to calibrate forecast ensembles. Monthly Weather Review. 133:1155–1174.

Sloughter, J. M., A. E. Raftery, T. Gneiting and C. Fraley. (2007). Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review. 135:3209–3220.

Fraley, C., A. E. Raftery and T. Gneiting. (2010). Calibrating multimodel forecast ensembles with exchangeable and missing members using Bayesian model averaging. Monthly Weather Review. 138:190–202.

Sloughter, J. M., T. Gneiting and A. E. Raftery. (2010). Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Journal of the American Statistical Association. 105:25–35.

Examples

## Not run: 
data(calibrationSample)

data(testSample)

# Component models whose predictions enter the ensemble
models <- c("LMER", "SAE", "GLM")

this.ForecastData <- makeForecastData(
  .predCalibration = calibrationSample[, models],
  .outcomeCalibration = calibrationSample[, "Insurgency"],
  .predTest = testSample[, models],
  .outcomeTest = testSample[, "Insurgency"],
  .modelNames = models
)

# Equal initial weights for the three component models (same as the default W)
initW <- rep(1/3, 3)

# Calibrate with the EM algorithm
this.ensemble.em <- calibrateEnsemble(this.ForecastData, model = "logit", tol = 0.001, W = initW)

# Calibrate with the Gibbs sampler
this.ensemble.gibbs <- calibrateEnsemble(this.ForecastData, model = "logit", method = "gibbs")

## End(Not run)

[Package EBMAforecast version 1.0.32 Index]