setartree {setartree} | R Documentation |
Fitting SETAR-Tree models
Description
Fits a SETAR-Tree model using either a list of time series or an embedded input matrix with labels.
Usage
setartree(
  data,
  label = NULL,
  lag = 10,
  depth = 1000,
  significance = 0.05,
  significance_divider = 2,
  error_threshold = 0.03,
  stopping_criteria = "both",
  mean_normalisation = FALSE,
  window_normalisation = FALSE,
  verbose = 2,
  categorical_covariates = NULL
)
Arguments
data |
A list of time series (each list element is a separate time series) or a dataframe/matrix containing model inputs (the columns can contain past time series lags and/or external numerical/categorical covariates). |
label |
A vector of true outputs. This parameter is only required when |
lag |
The number of past time series lags that should be used when fitting the SETAR-Tree. This parameter is only required when |
depth |
Maximum tree depth. Default value is 1000. Thus, unless you specify a lower value, the depth is effectively controlled by the stopping criterion. |
significance |
Initial significance used by the linearity test (alpha_0). Default value is 0.05. |
significance_divider |
The corresponding significance in each tree level is divided by this value. Default value is 2. |
error_threshold |
The minimum error reduction percentage between parent and child nodes to make a split. Default value is 0.03. |
stopping_criteria |
The required stopping criteria: linearity test (lin_test), error reduction percentage (error_imp) or linearity test and error reduction percentage (both). Default value is |
mean_normalisation |
Whether each series should be normalised by subtracting its mean value before building the tree. This parameter is only required when |
window_normalisation |
Whether the window-wise normalisation should be applied before building the tree. This parameter is only required when |
verbose |
Controls the verbosity level of SETAR-Tree: 0 (errors/warnings only), 1 (limited information, including the current tree depth), 2 (full training information, including the current tree depth and the stopping criterion results in each tree node). Default value is 2. |
categorical_covariates |
Names of the categorical covariates in the input data. This parameter is only required when |
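The interaction of significance, significance_divider and error_threshold can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper names level_significance and enough_error_reduction are hypothetical, the root is assumed to be level 0, and the error reduction is assumed to be measured relative to the parent node's error.

```r
# Hypothetical helper: significance used by the linearity test at a given
# tree level, assuming the root is level 0 and the divider is applied once
# per level.
level_significance <- function(level, significance = 0.05,
                               significance_divider = 2) {
  significance / significance_divider^level
}

# Hypothetical helper: TRUE when the relative error reduction from parent to
# children reaches the threshold (assumed to be relative to the parent error).
enough_error_reduction <- function(parent_error, child_error,
                                   error_threshold = 0.03) {
  (parent_error - child_error) / parent_error >= error_threshold
}

# Significance at levels 0 to 3 with the default settings
print(level_significance(0:3))

# A 4% reduction passes the default 3% threshold, a 2% reduction does not
print(enough_error_reduction(100, 96))
print(enough_error_reduction(100, 98))
```

Under these assumptions the linearity test becomes stricter with every level, so deeper splits need stronger evidence.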
Value
An object of class setartree, which contains the following properties.
leaf_models |
Trained global pooled regression models in each leaf node. |
opt_lags |
Optimal features used to split each node. |
opt_thresholds |
Optimal threshold values used to split each node. |
lag |
The number of features used to train the SETAR-Tree. |
feature_names |
Names of the input features. |
coefficients |
Names of the coefficients of leaf node regression models. |
num_leaves |
Number of leaf nodes in the SETAR-Tree. |
depth |
Depth of the SETAR-Tree which was determined based on the specified stopping criterion. |
leaf_instance_dis |
Number of instances used to train the regression models at each leaf node. |
stds |
The standard deviations of the residuals of each leaf node. |
categorical_covariate_values |
Information about the categorical covariates used during training (only if applicable). |
mean_normalisation |
Whether mean normalisation was applied to the training data. |
window_normalisation |
Whether window normalisation was applied to the training data. |
input_type |
Type of input data used to train the SETAR-Tree. This is |
execution_time |
Execution time of SETAR-Tree. |
Examples
# Training SETAR-Tree with a list of time series
setartree(chaotic_logistic_series)
# Training SETAR-Tree with a dataframe containing model inputs where the model inputs may contain
# past time series lags and numerical/categorical covariates
setartree(data = web_traffic_train[, -1],
          label = web_traffic_train[, 1],
          stopping_criteria = "both",
          categorical_covariates = "Project")
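To illustrate what an embedded input matrix with labels can look like, the sketch below builds one from a single short series with stats::embed. The toy series, the column names (y, Lag1, ...) and the column order are assumptions made for illustration and may differ from the embedding the package performs internally when it is given a list of time series.

```r
# Toy series and lag, chosen only for illustration
series <- c(5, 7, 6, 9, 8, 11, 10, 13)
lag <- 3

# embed() returns one row per window: the first column is the current value,
# the remaining columns are the previous `lag` values (most recent first).
embedded <- embed(series, dimension = lag + 1)
colnames(embedded) <- c("y", paste0("Lag", 1:lag))  # hypothetical names

label <- embedded[, "y"]                               # true outputs
inputs <- as.data.frame(embedded[, -1, drop = FALSE])  # lagged model inputs

print(inputs)
print(label)

# With a realistically sized dataset, inputs and labels in this shape could
# then be passed as:
# setartree(data = inputs, label = label)
```

The final call is left commented out because a handful of rows is far too few to train a meaningful tree.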