auto_data_cleaning {tsrobprep} | R Documentation |
Perform automatic data cleaning of time series data
Description
Returns a matrix or a list of matrices in which missing values and outliers have been imputed. The function automates the use of the functions model_missing_data, detect_outliers and impute_modelled_data. It is designed for numerical data only.
Usage
auto_data_cleaning(
data,
S,
tau = NULL,
no.of.last.indices.to.fix = S[1],
indices.to.fix = NULL,
model.missing.pars = list(),
detect.outliers.pars = list()
)
Arguments
data |
an input vector, matrix or data frame of dimension nobs x nvars containing missing values; each column is a variable. |
S |
a number or vector describing the seasonalities (S_1, ..., S_K) in the data, e.g. c(24, 168) if the data consists of 24 observations per day and there is a weekly seasonality in the data. |
tau |
the quantile(s) of the missing values to be estimated in the quantile regression. tau accepts any value in (0,1). If NULL, a weighted lasso regression is performed instead. |
no.of.last.indices.to.fix |
the number of observations in the tail of the data to be fixed, by default set to S[1], the first seasonality. |
indices.to.fix |
indices of the data to be fixed. If NULL, they are calculated from the no.of.last.indices.to.fix parameter. Otherwise, the no.of.last.indices.to.fix parameter is ignored. |
model.missing.pars |
named list containing additional arguments for the model_missing_data function. |
detect.outliers.pars |
named list containing additional arguments for the detect_outliers function. |
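As a rough illustration of the S and no.of.last.indices.to.fix arguments, the base R sketch below builds a seasonality vector for hourly data and derives the tail indices that the default would select. The variable names n_obs and tail_idx are hypothetical, and the use of tail() is only an assumption about how "the last S[1] observations" could be computed; the package may derive the indices differently.

```r
# Hourly data: 24 observations per day, weekly seasonality of 7 * 24 = 168.
S <- c(24, 7 * 24)

# Hypothetical series length, used only for this illustration.
n_obs <- 500

# Assumed reading of the default: fix the last S[1] observations.
tail_idx <- tail(seq_len(n_obs), S[1])

range(tail_idx)   # first and last index of the tail block
length(tail_idx)  # number of indices selected
```

Passing an explicit vector as indices.to.fix would replace this tail selection entirely, since no.of.last.indices.to.fix is then ignored.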
Details
The function first calls model_missing_data to impute the
missing values, then calls detect_outliers to detect
outliers and removes them, and finally applies
model_missing_data again to impute the removed values. For details see the
functions' respective help sections.
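The impute, detect, remove and re-impute sequence described above can be sketched with toy stand-ins in base R. The helpers seasonal_median, impute_seasonal and flag_outliers are hypothetical and deliberately simple (seasonal medians and a MAD rule); they are not the methods used by model_missing_data or detect_outliers, and only illustrate the order of the steps.

```r
# Toy helpers, NOT the tsrobprep implementations.

# Median of each seasonal phase, expanded back to the length of x.
seasonal_median <- function(x, S) {
  phase <- (seq_along(x) - 1) %% S
  med <- tapply(x, phase, median, na.rm = TRUE)
  as.numeric(med[as.character(phase)])
}

# Replace every NA by the median of its seasonal phase.
impute_seasonal <- function(x, S) {
  miss <- is.na(x)
  x[miss] <- seasonal_median(x, S)[miss]
  x
}

# Flag points whose seasonal residual lies more than k MADs from the median.
flag_outliers <- function(x, S, k = 5) {
  res <- x - seasonal_median(x, S)
  abs(res - median(res)) > k * mad(res)
}

# Small deterministic series with period 4, one gap and one spike.
S <- 4
x <- rep(c(10, 20, 30, 20), times = 6) + rep(c(-0.5, 0, 0.5), times = 8)
x[7] <- NA
x[14] <- 500

step1 <- impute_seasonal(x, S)      # 1. impute missing values
out <- flag_outliers(step1, S)      # 2. detect outliers
step1[out] <- NA                    # 3. remove them
clean <- impute_seasonal(step1, S)  # 4. impute again

which(out)
clean[c(7, 14)]
```

Running the imputation twice matters because an outlier left in place during the first pass can distort the values imputed around it; the second pass fills the gaps left by the removed outliers.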
Value
A list containing a matrix or a list of matrices with the missing values and outliers imputed, the indices of the data that were modelled, and the given quantile values.
References
Narajewski M, Kley-Holsteg J, Ziel F (2021). “tsrobprep — an R package for robust preprocessing of time series data.” SoftwareX, 16, 100809. doi: 10.1016/j.softx.2021.100809.
See Also
model_missing_data, detect_outliers, impute_modelled_data
Examples
## Not run:
autoclean <- auto_data_cleaning(
  data = GBload[, -1], S = c(48, 7 * 48),
  no.of.last.indices.to.fix = dim(GBload)[1],
  model.missing.pars = list(consider.as.missing = 0, min.val = 0)
)
autoclean$replaced.indices
## End(Not run)