auto_data_cleaning {tsrobprep}    R Documentation

Perform automatic data cleaning of time series data

Description

Returns a matrix or a list of matrices in which missing values and outliers have been imputed. The function automates the use of the functions model_missing_data, detect_outliers and impute_modelled_data. It is designed for numerical data only.

Usage

auto_data_cleaning(
  data,
  S,
  tau = NULL,
  no.of.last.indices.to.fix = S[1],
  indices.to.fix = NULL,
  model.missing.pars = list(),
  detect.outliers.pars = list()
)

Arguments

data

an input vector, matrix or data frame of dimension nobs x nvars containing missing values; each column is a variable.

S

a number or vector describing the seasonalities (S_1, ..., S_K) in the data, e.g. c(24, 168) if the data consists of 24 observations per day and there is a weekly seasonality in the data.
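
As a small illustration of the arithmetic behind S, the sketch below builds the daily and weekly seasonalities from the number of observations per day. The helper name make_seasonalities is hypothetical and not part of tsrobprep.

```r
# Hypothetical helper (not part of tsrobprep): daily and weekly
# seasonalities for data sampled obs.per.day times per day.
make_seasonalities <- function(obs.per.day) {
  c(obs.per.day, 7 * obs.per.day)
}

S.hourly      <- make_seasonalities(24)  # hourly data
S.half.hourly <- make_seasonalities(48)  # half-hourly data
```

Here S.hourly corresponds to the c(24, 168) case described above.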

tau

the quantile(s) of the missing values to be estimated in the quantile regression. Tau accepts all values in (0,1). If NULL, then the weighted lasso regression is performed.

no.of.last.indices.to.fix

the number of observations in the tail of the data to be fixed; by default set to S[1], the first seasonality.

indices.to.fix

indices of the data to be fixed. If NULL, then it is calculated based on the no.of.last.indices.to.fix parameter. Otherwise, the no.of.last.indices.to.fix parameter is ignored.
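
The relationship between the two parameters can be sketched as follows. This assumes that "the last n indices" means the final n row positions of the data; the package's internal calculation is not shown in this help page and may differ in detail. The value of nobs is made up for illustration.

```r
# Illustrative values only: nobs is a made-up number of observations.
nobs <- 1000
S <- c(48, 7 * 48)

# Default from the Usage section: the first seasonality.
no.of.last.indices.to.fix <- S[1]

# Assumed equivalent explicit index vector: the final S[1] positions.
indices.to.fix <- seq(nobs - no.of.last.indices.to.fix + 1, nobs)
```

Under that assumption, passing this vector as indices.to.fix would select the same observations as leaving it NULL with the default no.of.last.indices.to.fix.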

model.missing.pars

named list containing additional arguments for the model_missing_data function.

detect.outliers.pars

named list containing additional arguments for the detect_outliers function.
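
The general pattern of forwarding a named list of extra arguments to an inner function can be sketched in base R with do.call. Both functions below are hypothetical stand-ins; whether auto_data_cleaning uses do.call internally is an assumption and is not stated in this help page.

```r
# Hypothetical inner function standing in for a function that
# receives the additional arguments.
inner_fun <- function(data, min.val = -Inf, consider.as.missing = NULL) {
  list(min.val = min.val, consider.as.missing = consider.as.missing)
}

# Hypothetical wrapper: merges the fixed argument with the named list
# and passes everything on by name.
wrapper_fun <- function(data, inner.pars = list()) {
  do.call(inner_fun, c(list(data = data), inner.pars))
}

res <- wrapper_fun(1:10, inner.pars = list(consider.as.missing = 0, min.val = 0))
```

Each list element must be named after an argument of the receiving function; an unknown name would raise an "unused argument" error in this sketch.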

Details

The function first calls model_missing_data to impute the missing values, then calls detect_outliers to find outliers, removes the detected outliers, and finally applies model_missing_data again to impute the removed values. For details, see the respective help pages of these functions.
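
The order of the steps can be illustrated with a toy base R sketch. The two helpers below are deliberately simple stand-ins (median imputation and a median absolute deviation rule); they are not the algorithms used by model_missing_data or detect_outliers, and only mirror the sequence impute, detect, remove, impute again.

```r
# Toy stand-in for the imputation step: fill NAs with the median.
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Toy stand-in for the detection step: flag points further than
# k scaled MADs from the median.
flag_outliers <- function(x, k = 5) {
  which(abs(x - median(x)) > k * mad(x))
}

x <- c(10, 11, NA, 12, 500, 11, 10, NA, 12, 11)

step1 <- impute_median(x)       # 1. impute missing values
out   <- flag_outliers(step1)   # 2. detect outliers
step2 <- step1
step2[out] <- NA                # 3. remove the detected outliers
clean <- impute_median(step2)   # 4. impute again
```

In this toy series only the value 500 is flagged, and the final pass replaces it with the median of the remaining observations.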

Value

A list which contains a matrix or a list of matrices with imputed missing values or outliers, the indices of the data that were modelled, and the given quantile values.

References

Narajewski M, Kley-Holsteg J, Ziel F (2021). “tsrobprep — an R package for robust preprocessing of time series data.” SoftwareX, 16, 100809. doi: 10.1016/j.softx.2021.100809.

See Also

model_missing_data, detect_outliers, impute_modelled_data

Examples

## Not run: 
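# Clean all columns of GBload except the first one.
# S: 48 observations per day (half-hourly) and a weekly seasonality.
autoclean <- auto_data_cleaning(
  data = GBload[, -1], S = c(48, 7 * 48),
  # fix the whole series rather than only the last S[1] observations
  no.of.last.indices.to.fix = nrow(GBload),
  # treat zeros as missing and bound the imputed values below by 0
  model.missing.pars = list(consider.as.missing = 0, min.val = 0)
)
# Indices of the observations that were replaced
autoclean$replaced.indices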

## End(Not run)

[Package tsrobprep version 0.3.2 Index]