R: Compute forecast error

return_error {forecastML}

R Documentation

Compute forecast error

Description

Compute forecast error metrics on the validation datasets or a new test dataset.

Usage

return_error(
  data_results,
  data_test = NULL,
  test_indices = NULL,
  aggregate = stats::median,
  metrics = c("mae", "mape", "mdape", "smape", "rmse", "rmsse"),
  models = NULL,
  horizons = NULL,
  windows = NULL,
  group_filter = NULL
)

Arguments

`data_results`	An object of class 'training_results' or 'forecast_results' from running (a) `predict` on a trained model or (b) `combine_forecasts()`.
`data_test`	Required for forecast results only. A data.frame used to assess the accuracy of a 'forecast_results' object when `data_results` is of that class. `data_test` should have the outcome/target columns and any grouping columns.
`test_indices`	Required if `data_test` is given or if 'rmsse' is one of the requested `metrics`. Row indices or dates (class 'Date' or 'POSIXt') with length `nrow(data_test)`.
`aggregate`	Default `median`. A function, given without parentheses, that aggregates historical prediction or forecast error across time series. All error metrics are first calculated at the level of the individual time series. `aggregate` is then used to combine error metrics across validation windows and horizons. Aggregations are returned at the group level if `data_results` contains groups.
`metrics`	A character vector of common forecast error metrics. The default behavior is to return all metrics.
`models`	Optional. A character vector of user-defined model names supplied to `train_model()` to filter results.
`horizons`	Optional. A numeric vector to filter results by horizon.
`windows`	Optional. A numeric vector to filter results by validation window number.
`group_filter`	Optional. A string for filtering results for grouped time series (e.g., `"group_col_1 == 'A'"`). `group_filter` is passed to `dplyr::filter()` internally.
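
As a rough illustration of how a `group_filter` string selects rows, the sketch below evaluates the example string against a toy data.frame using base R only. The package itself passes the string to `dplyr::filter()`; the base R evaluation, the data.frame `data_error_toy`, and its columns are assumptions made for this example.

```r
# Hypothetical grouped error table; column names and values are illustrative only.
data_error_toy <- data.frame(
  group_col_1 = c("A", "A", "B", "B"),
  horizon = c(1, 12, 1, 12),
  mae = c(2.0, 3.5, 1.0, 4.0),
  stringsAsFactors = FALSE
)

group_filter <- "group_col_1 == 'A'"

# Parse the string into an expression and evaluate it with the data.frame's
# columns in scope. The result is one TRUE/FALSE value per row.
keep <- eval(parse(text = group_filter), envir = data_error_toy)

# Keep only the rows where the filter condition holds.
data_filtered <- data_error_toy[keep, , drop = FALSE]
print(data_filtered)
```

Because the string is evaluated as an expression over column names, the column referenced in `group_filter` has to exist in the grouped results.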

Value

An S3 object of class 'validation_error', 'forecast_error', or 'forecastML_error': a list of data.frames of error metrics for the validation or forecast dataset, depending on the class of `data_results` ('training_results', 'forecast_results', or 'forecastML' from `combine_forecasts()`).

A list containing:

Error metrics by model, horizon, and validation window
Error metrics by model and horizon, collapsed across validation windows
Global error metrics by model collapsed across horizons and validation windows
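
The three levels of aggregation listed above can be sketched with base R. This is a minimal illustration, not the package's implementation: the object names, the toy MAE values, and the choice to compute the global metric directly from the window-level rows are all assumptions.

```r
# Hypothetical per-window MAE values for one model; the numbers are made up.
error_by_window <- data.frame(
  model = "LASSO",
  horizon = c(1, 1, 1, 12, 12, 12),
  window = c(1, 2, 3, 1, 2, 3),
  mae = c(10, 12, 50, 20, 22, 24)
)

# The aggregation function is passed as an object, without parentheses,
# in the same way as the `aggregate` argument.
aggregate_fun <- stats::median

# Collapse across validation windows: one row per model and horizon.
error_by_horizon <- stats::aggregate(mae ~ model + horizon,
                                     data = error_by_window, FUN = aggregate_fun)

# Collapse across horizons and validation windows: one row per model.
error_global <- stats::aggregate(mae ~ model,
                                 data = error_by_window, FUN = aggregate_fun)

print(error_by_horizon)
print(error_global)
```

With the median, the outlying window at horizon 1 (MAE of 50) has little influence on the collapsed value. Swapping in `mean` would pull the horizon 1 value up to 24, which is one reason to choose the aggregation function deliberately.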

Error Metrics

mae: Mean absolute error (works with factor outcomes)
mape: Mean absolute percentage error
mdape: Median absolute percentage error
smape: Symmetrical mean absolute percentage error
rmse: Root mean squared error
rmsse: Root mean squared scaled error from the M5 competition
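
For reference, the metrics above can be written out in base R using their common textbook definitions. This is a sketch under stated assumptions: the toy vectors are made up, the percentage metrics are scaled by 100 here (the package may report them differently), `smape` uses one of several definitions in circulation, and `rmsse` follows the M5 convention of scaling by the mean squared one-step difference of the historical series.

```r
# Toy data; all values are made up for illustration.
y_train <- c(10, 12, 11, 13)  # historical outcomes, used only to scale rmsse
y       <- c(10, 20, 40)      # actual values in the evaluation period
y_pred  <- c(12, 18, 40)      # predictions for the same period

residual <- y - y_pred

# Mean absolute error.
mae <- mean(abs(residual))

# Mean and median absolute percentage error (undefined when y contains zeros).
mape  <- mean(abs(residual) / abs(y)) * 100
mdape <- stats::median(abs(residual) / abs(y)) * 100

# Symmetric mean absolute percentage error (one common variant).
smape <- mean(2 * abs(residual) / (abs(y) + abs(y_pred))) * 100

# Root mean squared error.
rmse <- sqrt(mean(residual^2))

# Root mean squared scaled error: forecast MSE divided by the MSE of a
# one-step naive forecast on the historical series.
rmsse <- sqrt(mean(residual^2) / mean(diff(y_train)^2))

print(c(mae = mae, mape = mape, mdape = mdape, smape = smape, rmse = rmse, rmsse = rmsse))
```

The scaling term in `rmsse` needs the historical series, which is consistent with `test_indices` being required when 'rmsse' is requested.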

Methods and related functions

The output of return_error() has the following generic S3 methods:

plot for 'validation_error' objects from return_error()
plot for 'forecast_error' objects from return_error()

Examples

# Sampled Seatbelts data from the R package datasets.
data("data_seatbelts", package = "forecastML")

# Example - Training data for 2 horizon-specific models w/ common lags per predictor.
horizons <- c(1, 12)
lookback <- 1:15

data_train <- create_lagged_df(data_seatbelts, type = "train", outcome_col = 1,
                               lookback = lookback, horizon = horizons)

# One custom validation window at the end of the dataset.
windows <- create_windows(data_train, window_start = 181, window_stop = 192)

# User-defined model - LASSO
# A user-defined wrapper function for model training that takes the following
# arguments: (1) a horizon-specific data.frame made with create_lagged_df(..., type = "train")
# (e.g., my_lagged_df$horizon_h) and, optionally, (2) any number of additional named arguments
# which are passed as '...' in train_model().
library(glmnet)
model_function <- function(data, my_outcome_col) {

  x <- data[, -(my_outcome_col), drop = FALSE]
  y <- data[, my_outcome_col, drop = FALSE]
  x <- as.matrix(x, ncol = ncol(x))
  y <- as.matrix(y, ncol = ncol(y))

  model <- glmnet::cv.glmnet(x, y, nfolds = 3)
  return(model)
}

# my_outcome_col = 1 is passed in ... but could have been defined in model_function().
model_results <- train_model(data_train, windows, model_name = "LASSO", model_function,
                             my_outcome_col = 1)

# User-defined prediction function - LASSO
# The predict() wrapper takes two positional arguments. First,
# the returned model from the user-defined modeling function (model_function() above).
# Second, a data.frame of predictors--identical to the datasets returned from
# create_lagged_df(..., type = "train"). The function can return a 1- or 3-column data.frame
# with either (a) point forecasts or (b) point forecasts plus lower and upper forecast
# bounds (column order and column names do not matter).
prediction_function <- function(model, data_features) {

  x <- as.matrix(data_features, ncol = ncol(data_features))

  data_pred <- data.frame("y_pred" = predict(model, x, s = "lambda.min"))
  return(data_pred)
}

# Predict on the validation datasets.
data_valid <- predict(model_results, prediction_function = list(prediction_function),
                      data = data_train)

# Forecast error metrics for validation datasets.
data_error <- return_error(data_valid)

[Package forecastML version 0.9.0 Index]