calibrateEnsemble {EBMAforecast}    R Documentation

Calibrate an ensemble Bayesian Model Averaging model

Description

This function calibrates an EBMA model based on out-of-sample performance in the calibration period. Given a dependent variable and calibration-sample predictions from multiple component forecast models in the ForecastData object, the calibrateEnsemble function fits an ensemble BMA mixture model. The weight assigned to each model is derived from that model's performance in the calibration period. Missing observations are allowed in the calibration period; however, models with missing observations are penalized. When missing observations are prevalent in the calibration set, the EM algorithm or Gibbs sampler is adjusted and model parameters are estimated by maximizing a renormalized partial expected complete-data log-likelihood (Fraley et al. 2010).
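
As a rough illustration of the renormalization idea, the base R sketch below combines component forecasts with a fixed set of weights and, where a component forecast is missing, rescales the remaining weights so they sum to one. The weights, the forecast values, and the helper combineForecasts are hypothetical and are not part of EBMAforecast. The sketch covers only the prediction step, not the penalty or the likelihood used during estimation.

```r
# Hypothetical model weights (sum to one) and component forecasts.
weights <- c(LMER = 0.5, SAE = 0.3, GLM = 0.2)
preds <- rbind(
  c(LMER = 0.2, SAE = 0.4, GLM = 0.6),  # all three forecasts available
  c(LMER = 0.8, SAE = NA,  GLM = 0.5)   # SAE forecast is missing
)

# Weighted average over the available forecasts only; the weights of
# the available models are renormalized to sum to one.
combineForecasts <- function(p, w) {
  ok <- !is.na(p)
  sum(w[ok] * p[ok]) / sum(w[ok])
}

ensemble <- apply(preds, 1, combineForecasts, w = weights)
print(ensemble)
```

In the second row the SAE weight is dropped and the LMER and GLM weights are rescaled by 1/0.7 before averaging.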

Usage

calibrateEnsemble(
  .forecastData = new("ForecastData"),
  exp = 1,
  tol = sqrt(.Machine$double.eps),
  maxIter = 1e+06,
  model = "logit",
  method = "EM",
  predType = "posteriorMedian",
  useModelParams = TRUE,
  W = rep(1/dim(.forecastData@predCalibration)[2], dim(.forecastData@predCalibration)[2]),
  const = 0,
  modelPriors = rep(1, dim(.forecastData@predCalibration)[2]),
  iterations = 40000,
  burns = 20000,
  thinning = 20,
  ...
)

## S4 method for signature 'ForecastData'
calibrateEnsemble(
  .forecastData = new("ForecastData"),
  exp = 1,
  tol = sqrt(.Machine$double.eps),
  maxIter = 1e+06,
  model = "logit",
  method = "EM",
  predType = "posteriorMean",
  useModelParams = TRUE,
  W = rep(1/dim(.forecastData@predCalibration)[2], dim(.forecastData@predCalibration)[2]),
  const = 0,
  modelPriors = rep(1, dim(.forecastData@predCalibration)[2]),
  iterations = 40000,
  burns = 20000,
  thinning = 20,
  ...
)
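
To make the default expressions in the usage above concrete, this sketch evaluates them in base R for a hypothetical three-model calibration matrix. The object predCalibration here only stands in for the predCalibration slot. The count of retained draws assumes that every thinning-th draw after the burn-in is kept, which this page does not state explicitly.

```r
# Hypothetical stand-in for .forecastData@predCalibration:
# four observations (rows) and three component models (columns).
predCalibration <- matrix(
  c(0.1, 0.7, 0.4, 0.9,
    0.2, 0.6, 0.5, 0.8,
    0.3, 0.5, 0.5, 0.7),
  ncol = 3,
  dimnames = list(NULL, c("LMER", "SAE", "GLM"))
)

nModels <- dim(predCalibration)[2]

# Default initial weights: equal weight for every model.
W <- rep(1 / nModels, nModels)

# Default Dirichlet prior: 1 for every model.
modelPriors <- rep(1, nModels)

# Default EM tolerance, about 1.5e-08 with IEEE double precision.
tol <- sqrt(.Machine$double.eps)

# Assumption: draws after burn-in are kept every `thinning` iterations.
iterations <- 40000
burns <- 20000
thinning <- 20
nKept <- (iterations - burns) / thinning

print(W)
print(nKept)
```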

Arguments

.forecastData

An object of class 'ForecastData' that will be used to calibrate the model.

exp

The exponential shrinkage term. Forecasts are raised to the (1/exp) power on the logit scale for the purposes of bias reduction. The default value is exp=1.

tol

Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization. The default is tol=sqrt(.Machine$double.eps), roughly 1.5e-08 on typical platforms.

maxIter

The maximum number of iterations the EM algorithm will run before stopping automatically. The default is maxIter=1e+06.

model

The model type that should be used given the type of data being predicted, e.g., "logit" (the default) for binary outcomes or "normal" for continuous outcomes.

method

The estimation method used. Either "EM" (the default) or "gibbs".

predType

The prediction type used for the EBMA model estimated with Gibbs sampling. The user can choose either posteriorMedian or posteriorMean (default). Model performance statistics are based on the posterior median or mean forecast. Note that the posterior median forecast is not equal to the forecast based on the median posterior weight. EM predictions are based on the mean.

useModelParams

If TRUE, individual model predictions are transformed based on logit models. If FALSE, all models' parameters will be set to 0 and 1.

W

A vector or matrix of initial model weights. If unspecified, each model receives a weight equal to 1/(number of models).

const

User-provided "wisdom of crowds" parameter that serves as the minimum model weight for all models. Default = 0. Only used in models estimated using EM.

modelPriors

User-provided vector of Dirichlet prior parameters, one for each model. Only used in the normal model estimated with Gibbs sampling. The default prior is 1 for each model.

iterations

The number of iterations for the Bayesian model. Default = 40000.

burns

The burn in for the Gibbs sampler. Default = 20000.

thinning

How much the Gibbs sampler is thinned. Default = 20.

...

Not implemented
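
The roles of W, tol, and maxIter can be sketched with a bare-bones EM loop for mixture weights on a binary outcome. This is an illustrative base R sketch, not the package's implementation: it assumes untransformed component forecasts (comparable in spirit to useModelParams = FALSE with exp = 1), no missing values, and simulated data. The function emWeights and all data objects are hypothetical.

```r
# Simulated calibration data (hypothetical): binary outcome y and three
# component forecasts of differing quality.
set.seed(1)
n <- 200
truth <- plogis(rnorm(n))
y <- rbinom(n, 1, truth)
preds <- cbind(
  good  = truth,
  noisy = plogis(qlogis(truth) + rnorm(n, sd = 2)),
  flat  = rep(0.5, n)
)

# Likelihood of each observed outcome under each component forecast.
lik <- preds^y * (1 - preds)^(1 - y)

emWeights <- function(lik, W = rep(1 / ncol(lik), ncol(lik)),
                      tol = sqrt(.Machine$double.eps), maxIter = 1e6) {
  logLik <- sum(log(lik %*% W))
  for (iter in seq_len(maxIter)) {
    # E-step: responsibility of each model for each observation.
    z <- sweep(lik, 2, W, `*`)
    z <- z / rowSums(z)
    # M-step: new weights are the average responsibilities.
    W <- colMeans(z)
    newLogLik <- sum(log(lik %*% W))
    # Stop once the log-likelihood improves by less than tol.
    converged <- (newLogLik - logLik) < tol
    logLik <- newLogLik
    if (converged) break
  }
  list(W = W, logLik = logLik, iter = iter)
}

fit <- emWeights(lik, tol = 1e-6, maxIter = 10000)
print(fit$W)
print(fit$iter)
```

A smaller tol lets the loop run longer before stopping, while maxIter caps the number of passes regardless of convergence.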

Value

Returns an object of class 'FDatFitLogit' or 'FDatFitNormal', a subclass of 'ForecastData', with the following slots:

predCalibration

A matrix containing the predictions of all component models and the EBMA model for all observations in the calibration period. Under Gibbs sampling, the EBMA prediction is either the median or mean of the posterior predictive distribution, depending on the predType setting.

predTest

A matrix containing the predictions of all component models and the EBMA model for all observations in the test period. Under Gibbs sampling, the EBMA prediction is either the median or mean of the posterior predictive distribution, depending on the predType setting.

outcomeCalibration

A vector containing the true values of the dependent variable for all observations in the calibration period.

outcomeTest

An optional vector containing the true values of the dependent variable for all observations in the test period.

modelNames

A character vector containing the names of all component models. If no model names are specified, names will be assigned automatically.

modelWeights

A vector containing the weight assigned to each model. When the Gibbs sampler is used, this slot contains either the median or mean of the posterior weights, depending on the predType setting.

modelParams

The parameters for the individual logit models that transform the component models.

useModelParams

Indicator of whether model parameters for the transformation were estimated.

logLik

The final log-likelihood for the calibrated EBMA model. Empty for estimations using the Gibbs sampler.

exp

The exponential shrinkage term.

tol

Tolerance for improvements in the log-likelihood before the EM algorithm will stop optimization.

maxIter

The maximum number of iterations the EM algorithm will run before stopping automatically.

method

The estimation method used.

iter

Number of iterations run in the EM algorithm. Empty for estimations using the Gibbs sampler.

call

The actual call used to create the object.

posteriorWeights

A matrix of the full posterior model weights from model calibration. Rows are the observations in the calibration period, columns are the saved iterations of the Gibbs sampler. Empty for EM estimations.

posteriorPredCalibration

A matrix of the posterior predictive distribution for observations in the calibration period, based on the full posterior of model weights. Empty for EM estimations.

posteriorPredTest

A matrix of the posterior predictive distribution for observations in the test period, based on the full posterior of model weights. Empty for EM estimations.
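
The distinction drawn for predType, that the posterior median forecast differs from the forecast built from median posterior weights, can be checked on a toy example. The base R sketch below uses a hypothetical set of three weight draws, laid out with one row per model and one column per saved draw. That layout and all values are assumptions of the sketch, not output from the package.

```r
# Hypothetical posterior weight draws: rows are models, columns are draws.
weightDraws <- cbind(
  c(0.6, 0.3, 0.1),
  c(0.2, 0.5, 0.3),
  c(0.1, 0.2, 0.7)
)
rownames(weightDraws) <- c("LMER", "SAE", "GLM")

# Hypothetical component forecasts for a single observation.
preds <- c(LMER = 0.9, SAE = 0.1, GLM = 0.5)

# Ensemble forecast under each draw of the weights.
predDraws <- as.vector(preds %*% weightDraws)

# Summaries of the posterior predictive draws.
posteriorMedianForecast <- median(predDraws)
posteriorMeanForecast <- mean(predDraws)

# Forecast built from the per-model median weights instead.
medianWeights <- apply(weightDraws, 1, median)
forecastFromMedianWeights <- sum(medianWeights * preds)

print(c(posteriorMedianForecast, forecastFromMedianWeights))
```

Here the per-model median weights do not even sum to one, which is one reason the two forecasts disagree.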

Author(s)

Michael D. Ward <michael.d.ward@duke.edu>, Jacob M. Montgomery <jacob.montgomery@wustl.edu>, and Florian M. Hollenbach <florian.hollenbach@tamu.edu>

References

Montgomery, Jacob M., Florian M. Hollenbach and Michael D. Ward. (2012). Improving Predictions Using Ensemble Bayesian Model Averaging. Political Analysis. 20:271–291.

Raftery, A. E., T. Gneiting, F. Balabdaoui and M. Polakowski. (2005). Using Bayesian Model Averaging to calibrate forecast ensembles. Monthly Weather Review. 133:1155–1174.

Sloughter, J. M., A. E. Raftery, T. Gneiting and C. Fraley. (2007). Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review. 135:3209–3220.

Fraley, C., A. E. Raftery and T. Gneiting. (2010). Calibrating multimodel forecast ensembles with exchangeable and missing members using Bayesian model averaging. Monthly Weather Review. 138:190–202.

Sloughter, J. M., T. Gneiting and A. E. Raftery. (2010). Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Journal of the American Statistical Association. 105:25–35.

Examples

## Not run: 
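library(EBMAforecast)

# Load the calibration and test samples shipped with the package
data(calibrationSample)
data(testSample)

# Bundle component-model predictions and observed outcomes into a ForecastData object
models <- c("LMER", "SAE", "GLM")
this.ForecastData <- makeForecastData(
  .predCalibration = calibrationSample[, models],
  .outcomeCalibration = calibrationSample[, "Insurgency"],
  .predTest = testSample[, models],
  .outcomeTest = testSample[, "Insurgency"],
  .modelNames = models
)

# Equal starting weights (identical to the default W for three models)
initW <- rep(1/3, 3)

# Calibrate with the EM algorithm, then with the Gibbs sampler
this.ensemble.em <- calibrateEnsemble(this.ForecastData, model = "logit", tol = 0.001, W = initW)
this.ensemble.gibbs <- calibrateEnsemble(this.ForecastData, model = "logit", method = "gibbs")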

## End(Not run)


[Package EBMAforecast version 1.0.32 Index]