kssa {kssa}    R Documentation

kssa Algorithm

Description

Run the Known Sub-Sequence Algorithm to compare the performance of imputation methods on a time series of interest
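The idea of validating imputation methods against known values can be sketched in base R. This is only an illustration under stated assumptions, not the kssa implementation: the toy series, the helper names (impute_linear, impute_locf, rmse) and the choice of RMSE as the error metric are all hypothetical.

```r
# Illustrative sketch only: base R, not the kssa implementation.
set.seed(1234)

# A complete toy monthly series (hypothetical data).
n <- 120
x <- ts(100 + 10 * sin(2 * pi * (1:n) / 12) + 0.5 * (1:n), frequency = 12)

# Hide 20% of the known values. The endpoints stay observed so that
# both imputers always have a value to start from.
hidden <- sample(2:(n - 1), size = round(0.2 * n))
x_na <- x
x_na[hidden] <- NA

# Imputer 1: linear interpolation between observed neighbours.
impute_linear <- function(y) {
  idx <- seq_along(y)
  obs <- !is.na(y)
  approx(idx[obs], y[obs], xout = idx)$y
}

# Imputer 2: last observation carried forward.
impute_locf <- function(y) {
  obs <- !is.na(y)
  y[obs][cumsum(obs)]
}

# Score each imputer only on the values that were deliberately hidden.
rmse <- function(truth, est) sqrt(mean((truth - est)^2))
scores <- c(
  linear_i = rmse(x[hidden], impute_linear(x_na)[hidden]),
  locf     = rmse(x[hidden], impute_locf(x_na)[hidden])
)
print(sort(scores))
```

A lower score means the imputed values were closer to the hidden true values on this particular series.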

Usage

kssa(
  x_ts,
  start_methods,
  actual_methods,
  segments = 5,
  iterations = 10,
  percentmd = 0.2,
  seed = 1234
)

Arguments

x_ts

Time series object (class ts) containing missing data (NA)

start_methods

String vector. The method or methods used to start the algorithm. Accepts the same values as actual_methods

actual_methods

The imputation methods to be compared and validated. It can be a string vector containing one or more of the following:

  • "all" - compare among all methods automatically - Default

  • "auto.arima" - State space representation of an ARIMA model

  • "StructTS" - State space representation of a structural model

  • "seadec" - Seasonal decomposition with Kalman smoothing

  • "linear_i" - Linear interpolation

  • "spline_i" - Spline interpolation

  • "stine_i" - Stineman interpolation

  • "simple_ma" - Simple moving average

  • "linear_ma" - Linear moving average

  • "exponential_ma" - Exponential moving average

  • "locf" - Last observation carried forward

  • "stl" - Seasonal and trend decomposition with Loess

For further details on these imputation methods, see the imputeTS and forecast packages.

segments

Integer. Number of segments into which the time series is divided
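As a rough illustration of what dividing a series into segments can look like, the sketch below cuts a toy series into 5 contiguous blocks of equal length. How kssa draws its segments internally is not described here, so the contiguous, equal-length layout and the names segment_id and blocks are assumptions.

```r
# Illustrative sketch only: contiguous, equal-length blocks are an assumption.
x <- ts(1:60, frequency = 12)   # hypothetical complete series
segments <- 5

# Assign each observation to one of `segments` consecutive blocks.
segment_id <- ceiling(seq_along(x) * segments / length(x))

# Split the values by block and inspect the block sizes.
blocks <- split(as.numeric(x), segment_id)
lengths(blocks)
```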

iterations

Integer. Number of iterations to run

percentmd

Numeric. Proportion of missing data, between 0 and 1 (for example, 0.2 for 20%). Must match the true proportion of missing data in x_ts
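One way to obtain a value for percentmd is to measure the share of NA values in the series directly. The short sketch below uses base R only; the toy series is hypothetical.

```r
# Hypothetical input series with gaps: 2 of its 10 values are NA.
x_ts <- ts(c(5, NA, 7, 8, NA, 10, 11, 12, 13, 14))

# Proportion of missing values, on the same 0-1 scale as percentmd.
percentmd <- mean(is.na(x_ts))
percentmd
```

For a series created with artificial gaps, as in the examples, the measured proportion should be close to the deletion rate that was requested.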

seed

Numeric. Random seed, used to make results reproducible

Value

A list of results to be plotted with function kssa_plot for easy interpretation

References

Benavides, I. F., Santacruz, M., Romero-Leiton, J. P., Barreto, C., & Selvaraj, J. J. (2022). Assessing methods for multiple imputation of systematic missing data in marine fisheries time series with a new validation algorithm. Aquaculture and Fisheries.

Examples



# Example 1: Compare all imputation methods

library("kssa")
library("imputeTS")

# Create 20% random missing data in tsAirgapComplete time series from imputeTS
airgap_na <- missMethods::delete_MCAR(as.data.frame(tsAirgapComplete), 0.2)

# Convert to time series object (tsAirgapComplete is monthly, 1949 to 1960)
airgap_na_ts <- ts(airgap_na, start = c(1949, 1), end = c(1960, 12), frequency = 12)

# Apply the kssa algorithm with 5 segments, 10 iterations, 20% of missing data,
# compare among all available methods in the package.
# Remember that percentmd must match
# the real proportion of missing data in the input time series

results_kssa <- kssa(airgap_na_ts,
  start_methods = "all",
  actual_methods = "all",
  segments = 5,
  iterations = 10,
  percentmd = 0.2
)

# Print and check results
results_kssa

# For an easy interpretation of kssa results
# please use function kssa_plot




# Example 2: Compare only locf and linear imputation

library("kssa")
library("imputeTS")

# Create 20% random missing data in tsAirgapComplete time series from imputeTS
airgap_na <- missMethods::delete_MCAR(as.data.frame(tsAirgapComplete), 0.2)

# Convert to time series object (tsAirgapComplete is monthly, 1949 to 1960)
airgap_na_ts <- ts(airgap_na, start = c(1949, 1), end = c(1960, 12), frequency = 12)

# Apply the kssa algorithm with 5 segments, 10 iterations, 20% of missing data,
# compare among all applied methods (locf and linear interpolation).
# Remember that percentmd must match
# the real proportion of missing data in the input time series

results_kssa <- kssa(airgap_na_ts,
  start_methods = c("locf", "linear_i"),
  actual_methods = c("locf", "linear_i"),
  segments = 5,
  iterations = 10,
  percentmd = 0.2
)

# Print and check results
results_kssa

# For an easy interpretation of kssa results
# please use function kssa_plot


[Package kssa version 0.0.1 Index]