| predict.forecast_model {forecastML} | R Documentation |
Predict on validation datasets or forecast
Description
Predict with a 'forecast_model' object from train_model(). If data = create_lagged_df(..., type = "train"),
predictions are returned for the outer-loop nested cross-validation datasets.
If data is an object of class 'lagged_df' from create_lagged_df(..., type = "forecast"),
predictions are returned for the horizons specified in create_lagged_df(horizons = ...).
Usage
## S3 method for class 'forecast_model'
predict(..., prediction_function = list(NULL), data)
Arguments
... |
One or more trained models from train_model().
prediction_function |
A list of user-defined prediction functions with length equal to
the number of models supplied in '...'.
data |
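If data = create_lagged_df(..., type = "train"), predictions are returned for the validation datasets. If data = create_lagged_df(..., type = "forecast"), forecasts are returned for the horizons specified in create_lagged_df(horizons = ...).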
Value
If data = create_lagged_df(..., type = "train"), an S3 object of class 'training_results'. If
data = create_lagged_df(..., type = "forecast"), an S3 object of class 'forecast_results'.
Columns in returned 'training_results' data.frame:
- model: User-supplied model name in train_model().
- model_forecast_horizon: The direct-forecasting time horizon that the model was trained on.
- window_length: Validation window length measured in dataset rows.
- window_number: Validation dataset number.
- valid_indices: Validation dataset row names from attributes(create_lagged_df())$row_indices.
- date_indices: If given and method = "direct", validation dataset date indices from attributes(create_lagged_df())$date_indices. If given and method = "multi_output", date_indices represents the date of the forecast.
- "groups": If given, the user-supplied groups in create_lagged_df().
- "outcome_name": The target being forecasted.
- "outcome_name"_pred: The model predictions.
- "outcome_name"_pred_lower: If given, the lower prediction bounds returned by the user-supplied prediction function.
- "outcome_name"_pred_upper: If given, the upper prediction bounds returned by the user-supplied prediction function.
- forecast_indices: If method = "multi_output", the validation index of the h-step-ahead forecast.
- forecast_date_indices: If method = "multi_output", the validation date index of the h-step-ahead forecast.
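As a rough illustration of how these columns might be used, the sketch below computes the mean absolute error per model and horizon. It runs on a small hand-made data.frame that only mimics a 'training_results' object: the values are invented, and the outcome name DriversKilled is an assumption. A real object would come from predict(..., data = data_train).

```r
# Toy stand-in for a 'training_results' data.frame (all values are made up).
results <- data.frame(
  model = "LASSO",
  model_forecast_horizon = c(1, 1, 12, 12),
  window_number = 1,
  valid_indices = c(181, 182, 181, 182),
  DriversKilled = c(100, 110, 100, 110),      # assumed outcome name
  DriversKilled_pred = c(104, 107, 90, 125)
)

# Absolute residual for each validation row.
results$abs_error <- abs(results$DriversKilled - results$DriversKilled_pred)

# Mean absolute error by model and direct-forecast horizon.
mae <- aggregate(abs_error ~ model + model_forecast_horizon,
                 data = results, FUN = mean)
print(mae)
```

Grouping by model_forecast_horizon keeps the short- and long-horizon models separate, which matters because each direct-forecast model is trained for its own horizon.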
Columns in returned 'forecast_results' data.frame:
- model: User-supplied model name in train_model().
- model_forecast_horizon: If method = "direct", the direct-forecasting time horizon that the model was trained on.
- horizon: Forecast horizons, 1:h, measured in dataset rows.
- window_length: Validation window length measured in dataset rows.
- forecast_period: The forecast period in row indices or dates. The forecast period starts at either attributes(create_lagged_df())$data_stop + 1 for row indices or attributes(create_lagged_df())$data_stop + 1 * frequency for date indices.
- "groups": If given, the user-supplied groups in create_lagged_df().
- "outcome_name": The target being forecasted.
- "outcome_name"_pred: The model forecasts.
- "outcome_name"_pred_lower: If given, the lower forecast bounds returned by the user-supplied prediction function.
- "outcome_name"_pred_upper: If given, the upper forecast bounds returned by the user-supplied prediction function.
Examples
# Sampled Seatbelts data from the R package datasets.
data("data_seatbelts", package = "forecastML")
# Example - Training data for 2 horizon-specific models w/ common lags per predictor.
horizons <- c(1, 12)
lookback <- 1:15
data_train <- create_lagged_df(data_seatbelts, type = "train", outcome_col = 1,
                               lookback = lookback, horizons = horizons)
# One custom validation window at the end of the dataset.
windows <- create_windows(data_train, window_start = 181, window_stop = 192)
# User-defined model - LASSO
# A user-defined wrapper function for model training that takes the following
# arguments: (1) a horizon-specific data.frame made with create_lagged_df(..., type = "train")
# (e.g., my_lagged_df$horizon_h) and, optionally, (2) any number of additional named arguments
# which are passed as '...' in train_model().
library(glmnet)
model_function <- function(data, my_outcome_col) {
  x <- data[, -(my_outcome_col), drop = FALSE]
  y <- data[, my_outcome_col, drop = FALSE]
  x <- as.matrix(x, ncol = ncol(x))
  y <- as.matrix(y, ncol = ncol(y))
  model <- glmnet::cv.glmnet(x, y, nfolds = 3)
  return(model)
}
# my_outcome_col = 1 is passed in ... but could have been defined in model_function().
model_results <- train_model(data_train, windows, model_name = "LASSO", model_function,
                             my_outcome_col = 1)
# User-defined prediction function - LASSO
# The predict() wrapper takes two positional arguments. First,
# the returned model from the user-defined modeling function (model_function() above).
# Second, a data.frame of predictors--identical to the datasets returned from
# create_lagged_df(..., type = "train"). The function can return a 1- or 3-column data.frame
# with either (a) point forecasts or (b) point forecasts plus lower and upper forecast
# bounds (column order and column names do not matter).
prediction_function <- function(model, data_features) {
  x <- as.matrix(data_features, ncol = ncol(data_features))
  data_pred <- data.frame("y_pred" = predict(model, x, s = "lambda.min"))
  return(data_pred)
}
# Predict on the validation datasets.
data_valid <- predict(model_results, prediction_function = list(prediction_function),
                      data = data_train)
# Forecast.
data_forecast <- create_lagged_df(data_seatbelts, type = "forecast", outcome_col = 1,
                                  lookback = lookback, horizons = horizons)
data_forecasts <- predict(model_results, prediction_function = list(prediction_function),
                          data = data_forecast)
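
The comments above note that a prediction function may also return three columns: point forecasts plus lower and upper bounds. A minimal sketch of that variant follows. It uses base R's lm() in place of glmnet so that it runs without extra packages; the toy data and the names model_lm and prediction_function_bounds are hypothetical and not part of forecastML.

```r
# Toy training data (made up): an outcome and a single lagged predictor.
set.seed(1)
train <- data.frame(x_lag_1 = 1:20)
train$y <- 2 * train$x_lag_1 + rnorm(20)

model_lm <- lm(y ~ x_lag_1, data = train)

# Same two positional arguments as the wrapper above: the model, then the features.
prediction_function_bounds <- function(model, data_features) {
  # predict.lm() returns a matrix with columns "fit", "lwr", and "upr".
  pred <- predict(model, newdata = data_features,
                  interval = "prediction", level = 0.95)
  data.frame(y_pred = pred[, "fit"],
             y_pred_lower = pred[, "lwr"],
             y_pred_upper = pred[, "upr"])
}

data_pred <- prediction_function_bounds(model_lm, data.frame(x_lag_1 = 21:22))
print(data_pred)
```

A function of this shape would presumably be supplied the same way as the one-column version, as one element of the list given to prediction_function.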