transform_forecasts {scoringutils} | R Documentation |
Transform forecasts and observed values
Description
Function to transform forecasts and true values before scoring.
Usage
transform_forecasts(data, fun = log_shift, append = TRUE, label = "log", ...)
Arguments
data |
A data.frame or data.table with the predictions and observations.
For scoring using
For scoring integer and continuous forecasts a
For scoring predictions in a quantile-format forecast you should provide
a column called
In addition a You can check the format of your data using |
fun |
A function used to transform both true values and predictions.
The default function is log_shift() (see Usage). |
append |
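Logical, defaults to TRUE. If TRUE, the transformed data are appended to the original data (see Value); if FALSE, only the transformed data are returned. |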
label |
A string for the newly created 'scale' column to denote the
newly transformed values. Only relevant if append = TRUE. |
... |
Additional parameters to pass to the function you supplied. For
the default option of log_shift(), this could be the offset argument (see Examples).
Details
Transforming forecasts and observations before scoring can be desirable for several reasons, depending on the circumstances (see the linked reference for more information). In epidemiology, for example, it may be useful to log-transform incidence counts before evaluating forecasts using scores such as the weighted interval score (WIS) or the continuous ranked probability score (CRPS). Log-transforming forecasts and observations changes the interpretation of the score from a measure of absolute distance between forecast and observation to a score that evaluates a forecast of the exponential growth rate. Other motivations are to apply a variance-stabilising transformation or to standardise incidence counts by population.
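The change in interpretation can be illustrated with a small base R sketch. It does not use scoringutils; the numbers are made up, and the absolute error stands in for a score such as the CRPS (to which it corresponds for a point forecast). Both hypothetical forecasts are 10% too high, but the counts differ by two orders of magnitude.

```r
# Hypothetical observations and forecasts (illustrative values only).
# Each forecast overshoots its observation by 10%.
observed <- c(100, 10000)
forecast <- c(110, 11000)

# On the natural scale the absolute error grows with the size of the counts.
ae_natural <- abs(forecast - observed)

# On the log scale the absolute error depends only on the ratio
# forecast / observed, so both forecasts receive the same score.
ae_log <- abs(log(forecast) - log(observed))

print(ae_natural)
print(ae_log)
```

On the natural scale the second forecast is penalised far more heavily; on the log scale both are treated as equally good, which is what a score of relative (multiplicative) error would suggest.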
Note that if you want to apply a transformation, it is important to transform the forecasts and observations and then apply the score. Applying a transformation after the score risks losing propriety of the proper scoring rule.
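As a rough base R illustration of why the order of operations matters, the sketch below compares scoring transformed values with transforming a score. The data are made up, the absolute error again stands in for a proper score, and the offset of 1 is an arbitrary choice. The sketch only shows that the two orders yield different quantities; it does not by itself demonstrate the loss of propriety.

```r
# Hypothetical observations and forecasts (illustrative values only).
observed <- c(10, 50, 400)
forecast <- c(15, 40, 500)
offset <- 1  # arbitrary offset, chosen here only to keep log() well defined

# Recommended order: transform forecasts and observations, then score.
score_of_transformed <- abs(log(forecast + offset) - log(observed + offset))

# Not recommended: score on the natural scale, then transform the score.
transformed_score <- log(abs(forecast - observed) + offset)

print(score_of_transformed)
print(transformed_score)
```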
Value
A data.table with either a transformed version of the data only, or
with both the untransformed and the transformed data. In the latter
case the output includes the original data as well as a transformation
of the original data, and there will be one additional column, 'scale',
which is set to "natural" for the untransformed forecasts and to the
value of label for the transformed ones.
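The shape of the appended output can be pictured with the following base R sketch. It is not the package's implementation: it uses a plain data.frame instead of a data.table, the column names true_value and prediction and all values are assumptions made for illustration, and the transformation is a log with an offset of 1.

```r
# Hypothetical input with one observation and one prediction per row.
original <- data.frame(
  model = c("a", "b"),
  true_value = c(9, 99),
  prediction = c(4, 120)
)

# Apply the same transformation to observations and predictions.
transformed <- original
transformed$true_value <- log(original$true_value + 1)
transformed$prediction <- log(original$prediction + 1)

# Mark which rows are on which scale, then stack the two versions.
original$scale <- "natural"
transformed$scale <- "log"
appended <- rbind(original, transformed)

print(appended)
```

The result has twice as many rows as the input, and the scale column can then be used as a grouping variable, as in the summarise_scores(by = c("model", "scale")) call in the Examples.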
Author(s)
Nikos Bosse nikosbosse@gmail.com
References
Nikos I. Bosse, Sam Abbott, Anne Cori, Edwin van Leeuwen, Johannes Bracher, Sebastian Funk (2023). Transformation of forecasts for evaluating predictive performance in an epidemiological context. medRxiv 2023.01.23.23284722. doi:10.1101/2023.01.23.23284722. https://www.medrxiv.org/content/10.1101/2023.01.23.23284722v1
Examples
library(magrittr) # pipe operator
# transform forecasts using the natural logarithm
# negative values need to be handled (here by replacing them with 0)
example_quantile %>%
  .[, true_value := ifelse(true_value < 0, 0, true_value)] %>%
  # Here we use the default function log_shift(), which is essentially the same
  # as log() but has an additional argument (offset) that allows you to add an
  # offset before applying the logarithm.
  transform_forecasts(append = FALSE) %>%
  head()
# alternatively, integrating the truncation in the transformation function:
example_quantile %>%
  transform_forecasts(
    fun = function(x) {log_shift(pmax(0, x))}, append = FALSE
  ) %>%
  head()
# specifying an offset for the log transformation removes the
# warning caused by zeros in the data
example_quantile %>%
  .[, true_value := ifelse(true_value < 0, 0, true_value)] %>%
  transform_forecasts(offset = 1, append = FALSE) %>%
  head()
# adding square root transformed forecasts to the original ones
example_quantile %>%
  .[, true_value := ifelse(true_value < 0, 0, true_value)] %>%
  transform_forecasts(fun = sqrt, label = "sqrt") %>%
  score() %>%
  summarise_scores(by = c("model", "scale"))
# adding multiple transformations
example_quantile %>%
  .[, true_value := ifelse(true_value < 0, 0, true_value)] %>%
  transform_forecasts(fun = log_shift, offset = 1) %>%
  transform_forecasts(fun = sqrt, label = "sqrt") %>%
  head()