R: Fitting SETAR-Tree models

setartree {setartree}

R Documentation

Fitting SETAR-Tree models

Description

Fits a SETAR-Tree model either using a list of time series or an embedded input matrix and labels.
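The "embedded input matrix and labels" form can be pictured with base R's `embed()`. The following is a minimal sketch, not the package's internal code: the helper name `embed_series` and the column names `Lag1`, `Lag2`, ... are assumptions made for illustration.

```r
# Hypothetical helper: turn a list of series into a lag matrix plus labels.
embed_series <- function(series_list, lag = 10) {
  rows <- lapply(series_list, function(x) {
    # embed() returns rows of the form (y_t, y_{t-1}, ..., y_{t-lag});
    # each series must have more than `lag` observations
    embed(as.numeric(x), lag + 1)
  })
  m <- do.call(rbind, rows)  # pool the rows of all series
  colnames(m) <- c("y", paste0("Lag", seq_len(lag)))
  list(
    label = m[, 1],                                # value to predict
    data  = as.data.frame(m[, -1, drop = FALSE])   # past lags as inputs
  )
}

# Two toy series, embedded with two lags each
toy <- list(1:6, c(10, 20, 30, 40, 50))
emb <- embed_series(toy, lag = 2)
emb$label  # 3 4 5 6 30 40 50
```

A `data`/`label` pair shaped like this corresponds to the dataframe/matrix input form, whereas passing `toy` directly corresponds to the list input form.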

Usage

setartree(
  data,
  label = NULL,
  lag = 10,
  depth = 1000,
  significance = 0.05,
  significance_divider = 2,
  error_threshold = 0.03,
  stopping_criteria = "both",
  mean_normalisation = FALSE,
  window_normalisation = FALSE,
  verbose = 2,
  categorical_covariates = NULL
)

Arguments

`data`	A list of time series (each list element is a separate time series) or a dataframe/matrix containing model inputs (the columns can contain past time series lags and/or external numerical/categorical covariates).
`label`	A vector of true outputs. This parameter is only required when `data` is a dataframe/matrix containing the model inputs.
`lag`	The number of past time series lags that should be used when fitting the SETAR-Tree. This parameter is only used when `data` is a list of time series. Default value is 10.
`depth`	Maximum tree depth. Default value is 1000. Thus, unless you specify a lower value, the depth is in practice controlled by the stopping criterion.
`significance`	Initial significance used by the linearity test (alpha_0). Default value is 0.05.
`significance_divider`	The corresponding significance in each tree level is divided by this value. Default value is 2.
`error_threshold`	The minimum error reduction percentage between parent and child nodes to make a split. Default value is 0.03.
`stopping_criteria`	The stopping criterion to use: linearity test (`"lin_test"`), error reduction percentage (`"error_imp"`), or both the linearity test and the error reduction percentage (`"both"`). Default value is `"both"`.
`mean_normalisation`	Whether each series should be normalised by subtracting its mean value before building the tree. This parameter is only used when `data` is a list of time series. Default value is FALSE.
`window_normalisation`	Whether window-wise normalisation should be applied before building the tree. This parameter is only used when `data` is a list of time series. When this is TRUE, each row of the training embedded matrix is normalised by subtracting its mean value before building the tree. Default value is FALSE.
`verbose`	Controls the verbosity of SETAR-Tree: 0 (errors/warnings only), 1 (limited information, including the current tree depth), 2 (full training information, including the current tree depth and the stopping criterion results in each tree node). Default value is 2.
`categorical_covariates`	Names of the categorical covariates in the input data. This parameter is only required when `data` is a dataframe/matrix and it contains categorical variables.
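Two of the arguments above describe simple arithmetic that a short base R sketch may make concrete. This is only an illustration under stated assumptions: it assumes the root level uses `significance` unchanged and each deeper level divides by `significance_divider` once more, and it applies the row-mean subtraction to the lag columns only. The helper `window_normalise` is hypothetical and not part of the package.

```r
# Assumed schedule: alpha at level d (root = 0) is significance / divider^d.
significance <- 0.05
significance_divider <- 2
tree_levels <- 0:3
alpha <- significance / significance_divider^tree_levels
alpha  # 0.05, 0.025, 0.0125, 0.00625

# Window-wise normalisation as described: subtract each row's mean.
# Hypothetical helper; the row means are kept so that predictions
# could be shifted back to the original scale afterwards.
window_normalise <- function(m) {
  m <- as.matrix(m)
  row_means <- rowMeans(m)
  # a vector of length nrow(m) is recycled down the columns,
  # so row i has row_means[i] subtracted from every entry
  list(normalised = m - row_means, row_means = row_means)
}

windows <- matrix(c(1, 2, 3,
                    4, 5, 6), nrow = 2, byrow = TRUE)
wn <- window_normalise(windows)
wn$normalised  # both rows become -1, 0, 1
```

Under this reading, a larger `significance_divider` makes splits progressively harder to accept at deeper levels, which is one way the tree depth stays bounded even with the default `depth = 1000`.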

Value

An object of class `setartree`, which contains the following components.

`leaf_models`	Trained global pooled regression models in each leaf node.
`opt_lags`	Optimal features used to split each node.
`opt_thresholds`	Optimal threshold values used to split each node.
`lag`	The number of features used to train the SETAR-Tree.
`feature_names`	Names of the input features.
`coefficients`	Names of the coefficients of the leaf node regression models.
`num_leaves`	Number of leaf nodes in the SETAR-Tree.
`depth`	Depth of the SETAR-Tree which was determined based on the specified stopping criterion.
`leaf_instance_dis`	Number of instances used to train the regression models at each leaf node.
`stds`	The standard deviations of the residuals of each leaf node.
`categorical_covariate_values`	Information about the categorical covariates used during training (only if applicable).
`mean_normalisation`	Whether mean normalisation was applied for the training data.
`window_normalisation`	Whether window normalisation was applied for the training data.
`input_type`	Type of input data used to train the SETAR-Tree. This is `list` if `data` is a list of time series, and `df` if `data` is a dataframe/matrix containing model inputs.
`execution_time`	Execution time of SETAR-Tree.
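As a rough sketch of how these components might be inspected, the snippet below assumes the returned object is list-like, so that components can be read with `$`. The helper `describe_fit` is hypothetical, and `mock_fit` is a stand-in with made-up values used only to show the access pattern; it is not output from the package.

```r
# Hypothetical helper; `fit` is assumed to be list-like with the
# components named above (num_leaves, depth, leaf_instance_dis).
describe_fit <- function(fit) {
  sprintf(
    "%d leaves, depth %d, %d training instances",
    as.integer(fit$num_leaves),
    as.integer(fit$depth),
    as.integer(sum(unlist(fit$leaf_instance_dis)))
  )
}

# Stand-in object with made-up values, not a real fitted model.
mock_fit <- list(
  num_leaves = 3,
  depth = 2,
  leaf_instance_dis = c(120, 80, 40)
)
describe_fit(mock_fit)
```

With a real model, the same call would be `describe_fit(setartree(...))`, provided the assumption about `$` access holds.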

Examples


library(setartree)

# Training SETAR-Tree with a list of time series
# (chaotic_logistic_series is an example dataset used by this help page)
tree_from_list <- setartree(chaotic_logistic_series)

# Training SETAR-Tree with a dataframe of model inputs, which may contain
# past time series lags and numerical/categorical covariates.
# Column 1 of web_traffic_train holds the labels; the rest are inputs.
tree_from_df <- setartree(data = web_traffic_train[, -1],
                          label = web_traffic_train[, 1],
                          stopping_criteria = "both",
                          categorical_covariates = "Project")

[Package setartree version 0.2.1 Index]