knn_param_search {knnp}    R Documentation

Searches for the optimal values of k and d for a given time series

Description

Searches for the optimal values of k and d for a given time series. First, the values for instants initial + 1 through n (the length of the given time series) are predicted one at a time, each from the whole known past: the value at instant initial + 1 is predicted using instants 1 to initial, the value at instant initial + 2 using instants 1 to initial + 1, and so on until the value at instant n is predicted using instants 1 to n - 1. Finally, the error between the predicted values and the real values of the series is evaluated. This version of the optimization function uses a parallelized distance calculation function, and the computation of the predicted values is parallelized over the values of d.
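The evaluation scheme described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: the helper names knn_forecast_one and rolling_origin_mae are hypothetical, d is assumed to be the number of consecutive past values compared, the distance is Euclidean, neighbors are averaged with equal weights, the error is MAE, and initial = 100 is an arbitrary choice.

```r
# Hypothetical helper: one-step-ahead kNN forecast from the known past.
# Assumption: d is the length of the window of consecutive values compared.
knn_forecast_one <- function(past, k, d) {
  past <- as.numeric(past)
  n <- length(past)
  # Row i of `windows` holds past[i:(i + d - 1)], in chronological order.
  windows <- embed(past, d)[, d:1, drop = FALSE]
  m <- nrow(windows)
  target <- windows[m, ]                      # most recent window
  candidates <- windows[-m, , drop = FALSE]   # windows whose next value is known
  next_vals <- past[(d + 1):n]                # value that followed each candidate
  dists <- sqrt(rowSums(sweep(candidates, 2, target)^2))  # Euclidean distances
  nearest <- order(dists)[seq_len(min(k, length(dists)))]
  mean(next_vals[nearest])                    # equal-weight average (assumption)
}

# Hypothetical helper: predict instants initial + 1 .. n, each from instants
# 1 .. t - 1, then compare the predictions with the real values using MAE.
rolling_origin_mae <- function(y, k, d, initial) {
  y <- as.numeric(y)
  instants <- (initial + 1):length(y)
  predicted <- vapply(
    instants,
    function(t) knn_forecast_one(y[1:(t - 1)], k, d),
    numeric(1)
  )
  mean(abs(y[instants] - predicted))
}

# Grid search over k and d: one error per combination, then the minimum.
y <- as.numeric(AirPassengers)
ks <- 1:5
ds <- 1:3
errors <- outer(
  ks, ds,
  Vectorize(function(k, d) rolling_origin_mae(y, k, d, initial = 100))
)
dimnames(errors) <- list(k = as.character(ks), d = as.character(ds))
idx <- arrayInd(which.min(errors), dim(errors))
best_k <- ks[idx[1]]
best_d <- ds[idx[2]]
```

Here errors has one row per tested k and one column per tested d, and best_k and best_d are the combination with the lowest error in this sketch.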

Usage

knn_param_search(
  y,
  k,
  d,
  initial = NULL,
  distance = "euclidean",
  error_measure = "MAE",
  weight = "proportional",
  v = 1,
  threads = 1
)

Arguments

y

A time series.

k

Values of k (the number of nearest neighbors) to be analyzed.

d

Values of d to be analyzed.

initial

Last instant of the known past for the first prediction; the first value predicted corresponds to instant initial + 1.

distance

Type of metric used to evaluate the distance between points. Many metrics are supported, including Euclidean, Manhattan, dynamic time warping and Canberra. Distances are calculated with the function parDist from the parallelDist package, so this argument accepts the values of that function's 'method' argument, for example "euclidean", "manhattan", "dtw", "canberra" and "chord". Package information: https://cran.r-project.org/web/packages/parallelDist

error_measure

Type of metric used to evaluate the prediction error. Five metrics are supported:

ME

Mean Error

RMSE

Root Mean Squared Error

MAE

Mean Absolute Error

MPE

Mean Percentage Error

MAPE

Mean Absolute Percentage Error
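For reference, the five measures listed above can be written in base R as below. This is a sketch using the common textbook definitions, not code taken from the package: the function name forecast_errors is hypothetical, the error is assumed to be actual minus predicted, and the percentage measures are assumed to be scaled by 100.

```r
# Hypothetical helper computing the five error measures for two numeric vectors.
forecast_errors <- function(actual, predicted) {
  e <- actual - predicted     # assumed sign convention
  pe <- 100 * e / actual      # percentage errors; undefined where actual == 0
  c(
    ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = mean(pe),
    MAPE = mean(abs(pe))
  )
}

res <- forecast_errors(actual = c(100, 200, 400), predicted = c(110, 190, 400))
```

ME and MPE let positive and negative errors cancel, so they measure bias; RMSE, MAE and MAPE measure the size of the errors.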

weight

Type of weight to be used when calculating the predicted value as a weighted mean. Three types are supported: proportional, average and linear.

proportional

the weight assigned to each neighbor is inversely proportional to its distance

average

all neighbors are assigned the same weight

linear

the nearest neighbor is assigned weight k, the second nearest neighbor weight k - 1, and so on down to the farthest of the k neighbors, which is assigned weight 1.
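The three weighting schemes can be illustrated with a short base R sketch. The helper name neighbor_weights is hypothetical, the distances are assumed to be sorted from nearest to farthest and strictly positive, and the weights are normalized to sum to 1 so that the weighted mean is a plain weighted sum; the package's own handling of these details may differ.

```r
# Hypothetical helper: weights for k neighbors sorted from nearest to farthest.
neighbor_weights <- function(distances,
                             type = c("proportional", "average", "linear")) {
  type <- match.arg(type)
  k <- length(distances)
  w <- switch(type,
    proportional = 1 / distances,  # inversely proportional; assumes distances > 0
    average      = rep(1, k),      # same weight for every neighbor
    linear       = k:1             # k for the nearest, down to 1 for the farthest
  )
  w / sum(w)                       # normalize so the weights sum to 1
}

dist_sorted <- c(0.5, 1, 2)        # distances of the 3 nearest neighbors
next_values <- c(10, 20, 40)       # value that followed each neighbor
predictions <- sapply(
  c("proportional", "average", "linear"),
  function(type) sum(neighbor_weights(dist_sorted, type) * next_values)
)
```

With these illustrative numbers the proportional and linear schemes pull the prediction toward the value of the nearest neighbor, while the average scheme treats all three neighbors alike.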

v

Variable to be predicted when y is a multivariate time series.

threads

Number of threads to be used when parallelizing. The default is 1.

Value

A matrix of errors and the optimal k and d, along with all the tested values of k and d and all the metrics used.

Examples

knn_param_search(AirPassengers, 1:5, 1:3)
knn_param_search(LakeHuron, 1:10, 1:6)

[Package knnp version 2.0.0 Index]