knn.forecast.boot.intervals {knnwtsim} | R Documentation
KNN Forecast Bootstrap Prediction Intervals
Description
A function for forecasting using KNN regression with prediction intervals. The approach is based on the description of "Prediction intervals from bootstrapped residuals" in Chapter 5.5 of Hyndman R, Athanasopoulos G (2021) https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals, modified as needed for use with KNN regression.
The algorithm starts by calculating a pool of forecast errors to sample from later. If there are n points prior to the first observation indicated in f.index.in, then n - k.in errors are generated by one-step-ahead forecasts, starting with the point of the response series at index k.in + 1. The first k.in points cannot be estimated because a minimum of k.in eligible neighbors is needed. The optional burn.in argument can be used to increase the number of points from the start of the series that must be available as neighbors before errors are calculated for the pool.
Next, B possible paths the series could take are simulated using the pool of errors. Each path is simulated by calling knn.forecast() to estimate the first point in f.index.in, adding a sampled forecast error to that estimate, and then appending the result to the end of the series. This process is repeated for the next point in f.index.in until all have been estimated. The final interval estimates are calculated for each point in f.index.in by taking the appropriate percentiles of the corresponding simulations of that point.
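The steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: toy.knn.forecast() is a hypothetical stand-in for knn.forecast(), the series and similarity matrix are made up, burn.in is ignored, and equal-tailed percentiles are an assumption about how the interval bounds are chosen.

```r
# Hypothetical stand-in for knn.forecast(): for each index in f.index, average
# the responses of the k most similar points that precede the forecast indices.
toy.knn.forecast <- function(sim.mat, f.index, k, y) {
  eligible <- seq_len(min(f.index) - 1)
  sapply(f.index, function(i) {
    ranked <- eligible[order(sim.mat[i, eligible], decreasing = TRUE)]
    mean(y[ranked[seq_len(k)]])
  })
}

set.seed(42)
n.total <- 60
y <- sin(seq_len(n.total) / 4) + rnorm(n.total, sd = 0.1)
# Toy similarity (assumption for this demo): points close in time are similar
sim.mat <- 1 / (1 + abs(outer(seq_len(n.total), seq_len(n.total), "-")))

f.index <- 56:60  # points to forecast
k <- 3
B <- 200
level <- 0.95

# Step 1: pool of one-step-ahead errors for points k + 1, ..., n,
# where n is the number of points before the first forecast index
n <- min(f.index) - 1
error.pool <- sapply((k + 1):n, function(i) {
  y[i] - toy.knn.forecast(sim.mat, i, k, y)
})

# Step 2: simulate B paths; each simulated value is written back into the
# series so that it is an eligible neighbor for the next forecast point
sim.paths <- matrix(NA_real_, nrow = B, ncol = length(f.index))
for (b in seq_len(B)) {
  y.sim <- y
  for (j in seq_along(f.index)) {
    i <- f.index[j]
    y.sim[i] <- toy.knn.forecast(sim.mat, i, k, y.sim) + sample(error.pool, 1)
    sim.paths[b, j] <- y.sim[i]
  }
}

# Step 3: percentiles of the simulations of each forecast point
alpha <- 1 - level
lb <- apply(sim.paths, 2, quantile, probs = alpha / 2)
ub <- apply(sim.paths, 2, quantile, probs = 1 - alpha / 2)
```

With 55 points before the first forecast index and k = 3, the pool holds 52 errors, matching the n - k.in count described above.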
The mean and median are also calculated from these simulations. One important implication of this behavior is that the mean forecast output from this function can differ from the point forecast produced by knn.forecast() alone.
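A small base R illustration of why the two can differ, for a single forecast step only. The point forecast and error pool below are made-up values, not output from the package. For later steps, feeding simulated values back into the series is a further source of difference that this sketch does not cover.

```r
set.seed(1)
point.forecast <- 10                         # made-up point forecast
error.pool <- c(-0.5, 0.2, 0.4, 1.1, 1.3)    # made-up errors with mean 0.5
B <- 200

# Each simulated value is the point forecast plus one resampled error
drawn <- sample(error.pool, B, replace = TRUE)
simulated <- point.forecast + drawn

mean.forecast <- mean(simulated)
# The gap is the average of the resampled errors, which is generally not zero
gap <- mean.forecast - point.forecast
```

Because the errors in the pool need not average to zero, the mean of the simulations is shifted away from the point forecast by roughly the mean of the pool.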
Usage
knn.forecast.boot.intervals(
  Sim.Mat.in,
  f.index.in,
  k.in,
  y.in,
  burn.in = NULL,
  B = 200,
  return.simulations = FALSE,
  level = 0.95
)
Arguments
Sim.Mat.in |
numeric and symmetric matrix of similarities (recommend use of |
f.index.in |
numeric vector indicating the indices of |
k.in |
integer value indicating the number of nearest neighbors to be considered in forecasting, must be |
y.in |
numeric vector of the response series to be forecast. |
burn.in |
integer value which indicates how many points at the start of the series to set aside as eligible neighbors before calculating forecast errors to be re-sampled. |
B |
integer value representing the number of bootstrap replications, this will be the number of forecasts simulated and used to calculate outputs, must be |
return.simulations |
logical value indicating whether to return all simulated forecasts. |
level |
numeric value over the range (0,1) indicating the confidence level for the prediction intervals. |
Value
list of the following components:
- lb: numeric vector of the same length as f.index.in, with the estimated lower bound of the prediction interval.
- ub: numeric vector of the same length as f.index.in, with the estimated upper bound of the prediction interval.
- mean: numeric vector of the same length as f.index.in, with the mean of the B simulated paths for each forecasted point.
- median: numeric vector of the same length as f.index.in, with the median of the B simulated paths for each forecasted point.
- simulated.paths: numeric matrix where each of the B rows contains a simulated path for the points in f.index.in, only returned if return.simulations = TRUE.
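When the simulated paths are available, the other components can be recomputed from them, for example at a different level, without rerunning the bootstrap. The sketch below uses a mock matrix in place of a real simulated.paths result. summarise.paths() is a hypothetical helper, and its equal-tailed percentiles with the default quantile() type are assumptions that may not match the package's exact calculation.

```r
set.seed(7)
B <- 200
h <- 5
# Mock stand-in for simulated.paths: B rows, one column per forecast point
simulated.paths <- matrix(rnorm(B * h, mean = 50, sd = 2), nrow = B, ncol = h)

# Hypothetical helper: column-wise summaries of a matrix of simulated paths
summarise.paths <- function(paths, level = 0.95) {
  alpha <- 1 - level
  list(
    lb = unname(apply(paths, 2, quantile, probs = alpha / 2)),
    ub = unname(apply(paths, 2, quantile, probs = 1 - alpha / 2)),
    mean = colMeans(paths),
    median = apply(paths, 2, median)
  )
}

int.95 <- summarise.paths(simulated.paths, level = 0.95)
int.80 <- summarise.paths(simulated.paths, level = 0.80)
```

The 80% bounds always sit inside the 95% bounds here because both are percentiles of the same simulations.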
See Also
- knn.forecast() for the function called to perform knn regression.
- SwMatrixCalc() for the function to calculate a matrix with the recommended similarity measure.
- Hyndman R, Athanasopoulos G (2021), "Forecasting: Principles and Practice, 3rd ed", Chapter 5.5, https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals, for background on the algorithm this function is based on.
Examples
data("simulation_master_list")
series.index <- 15
ex.series <- simulation_master_list[[series.index]]$series.lin.coef.chng.x
# Weights pre tuned by random search. In alpha, beta, gamma order
pre.tuned.wts <- c(0.2148058, 0.2899638, 0.4952303)
pre.tuned.k <- 5
df <- data.frame(ex.series)
# Generate vector of time orders
df$t <- c(1:nrow(df))
# Generate vector of periods
nperiods <- simulation_master_list[[series.index]]$seasonal.periods
df$p <- rep(1:nperiods, length.out = nrow(df))
# Pull corresponding exogenous predictor(s)
X <- as.matrix(simulation_master_list[[series.index]]$x.chng)
# Calculate the weighted similarity matrix using Sw
Sw.ex <- SwMatrixCalc(
  t.in = df$t,
  p.in = df$p, nPeriods.in = nperiods,
  X.in = X,
  weights = pre.tuned.wts
)
n <- length(ex.series)
# Index we want to forecast
f.index <- c((n - 5 + 1):length(ex.series))
interval.forecast <- knn.forecast.boot.intervals(
  Sim.Mat.in = Sw.ex,
  f.index.in = f.index,
  y.in = ex.series,
  k.in = pre.tuned.k
)