create_windows {forecastML}    R Documentation

Create time-contiguous validation datasets for model evaluation

Description

Flexibly create blocks of time-contiguous validation datasets to assess the forecast accuracy of trained models at various times in the past. These validation datasets are similar to the outer loop of a nested cross-validation model training setup.

Usage

create_windows(
  lagged_df,
  window_length = 12L,
  window_start = NULL,
  window_stop = NULL,
  skip = 0,
  include_partial_window = TRUE
)

Arguments

lagged_df

An object of class 'lagged_df' or 'grouped_lagged_df' from create_lagged_df.

window_length

An integer that defines the length of each contiguous validation dataset in dataset rows/dates. If dates were given in create_lagged_df(), each validation window spans window_length * 'date frequency' in calendar time. Setting window_length = 0 trains the model on (a) the entire dataset or (b) the rows between a single window_start and window_stop value. Supplying window_start and window_stop as vectors of length > 1 overrides window_length.

window_start

Optional. A row index or date identifying the row/date to start creating contiguous validation datasets. A vector of start rows/dates can be supplied for greater control. The length and order of window_start should match window_stop. If length(window_start) > 1, window_length, skip, and include_partial_window are ignored.

window_stop

Optional. A row index or date identifying the row/date to stop creating contiguous validation datasets. A vector of stop rows/dates can be supplied for greater control. The length and order of window_stop should match window_start. If length(window_stop) > 1, window_length, skip, and include_partial_window are ignored.

skip

An integer giving a fixed number of dataset rows/dates to skip between validation datasets. If dates were given in create_lagged_df(), the time between validation windows is skip * 'date frequency'.

include_partial_window

Boolean. If TRUE, keep validation datasets that are shorter than window_length.
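
The interaction of window_length, skip, and include_partial_window can be pictured with a small base-R sketch. This is an illustration under stated assumptions, not the package's implementation: sketch_windows() and its n_rows argument are hypothetical names, it assumes window_length >= 1 and plain row indices (no dates or groups), and create_windows() may treat boundaries differently.

```r
# Hypothetical helper (not part of forecastML): split rows 1..n_rows into
# contiguous validation windows of `window_length` rows, leaving `skip`
# rows between consecutive windows. Assumes window_length >= 1.
sketch_windows <- function(n_rows, window_length = 12L, skip = 0L,
                           include_partial_window = TRUE) {
  # Each window starts window_length + skip rows after the previous one.
  start <- seq(1L, n_rows, by = window_length + skip)
  # A window ends window_length - 1 rows after its start, capped at the last row.
  stop <- pmin(start + window_length - 1L, n_rows)
  out <- data.frame(start = start, stop = stop,
                    window_length = stop - start + 1L)
  if (!include_partial_window) {
    # Drop any window shorter than the requested length.
    out <- out[out$window_length == window_length, , drop = FALSE]
  }
  out
}

# Three windows over 30 rows; the last one is partial.
sketch_windows(n_rows = 30, window_length = 12)

# Full windows only, with 3 rows skipped between them.
sketch_windows(n_rows = 30, window_length = 12, skip = 3,
               include_partial_window = FALSE)
```

With 30 rows, the first call gives windows covering rows 1 to 12, 13 to 24, and a partial window 25 to 30. The second call gives rows 1 to 12 and 16 to 27; rows 28 to 30 fall outside any window because the next start would lie past the end of the data.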

Value

An S3 object of class 'windows': A data.frame giving the indices for the validation datasets.

Methods and related functions

The output of create_windows() is passed into

train_model

and has the following generic S3 methods

plot

Examples

# Sampled Seatbelts data from the R package datasets.
data("data_seatbelts", package = "forecastML")

# Example - Training data for 2 horizon-specific models w/ common lags per feature.
horizons <- c(1, 12)
lookback <- 1:15

data_train <- create_lagged_df(data_seatbelts, type = "train", outcome_col = 1,
                               lookback = lookback, horizons = horizons)

# Contiguous windows of length 12 across the full history, plus any partial window at the end of the dataset.
windows <- create_windows(data_train, window_length = 12)
windows

# Two custom validation windows with different lengths.
windows <- create_windows(data_train, window_start = c(20, 80), window_stop = c(30, 100))
windows
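
# A further sketch using only the arguments documented above (the specific
# values are illustrative assumptions): windows of 12 rows separated by 6
# skipped rows, dropping any window shorter than 12 rows.
windows <- create_windows(data_train, window_length = 12, skip = 6,
                          include_partial_window = FALSE)
windows

# window_length = 0 with no window_start/window_stop: per the argument
# description, the model is trained on the entire dataset.
windows <- create_windows(data_train, window_length = 0)
windows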

[Package forecastML version 0.9.0 Index]