find_odd_streams {oddstream}    R Documentation

Detect outlying series within a collection of streaming time series

Description

This function detects outlying series within a collection of streaming time series. A sliding window is used to handle the streaming data. In the presence of concept drift, the forecast boundary for the system's typical behaviour can be updated periodically.

Usage

find_odd_streams(train_data, test_stream, update_threshold = TRUE,
  window_length = nrow(train_data), window_skip = window_length,
  concept_drift = FALSE, trials = 500, p_rate = 0.001,
  cd_alpha = 0.05)

Arguments

train_data

A multivariate time series data set that represents the typical behaviour of the system.

test_stream

A multivariate streaming time series data set to be tested for outliers.

update_threshold

If TRUE, the threshold value used to determine outlying series is updated. The default is TRUE.

window_length

Sliding window size. Ideally, this should equal the length of the training multivariate time series data set that is used to define the outlying threshold. The default is nrow(train_data).

window_skip

The number of steps the window slides forward. The default is window_length.

concept_drift

If TRUE, the outlying threshold will be updated after each window. The default is FALSE.

trials

Input for the set_outlier_threshold function. The default is 500.

p_rate

False positive rate. Default value is set to 0.001.

cd_alpha

Significance level for the test of non-stationarity. The default is 0.05.
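
The interaction between window_length and window_skip can be sketched in base R. This is only an illustration under an assumed indexing scheme (windows start at row 1, advance by window_skip rows, and incomplete trailing windows are dropped); window_ranges is a hypothetical helper, not a function exported by oddstream, and the package may place its windows differently.

```r
# Hypothetical helper (not part of oddstream): list the row ranges that a
# sliding window would cover over a stream with n rows.
# Assumption: incomplete windows at the end of the stream are dropped.
window_ranges <- function(n, window_length, window_skip = window_length) {
  if (n < window_length) {
    # No complete window fits in the stream
    return(data.frame(start = integer(0), end = integer(0)))
  }
  start <- seq(1, n - window_length + 1, by = window_skip)
  data.frame(start = start, end = start + window_length - 1)
}

# Non-overlapping windows (window_skip equal to window_length)
w_full <- window_ranges(n = 1356, window_length = 100)

# Overlapping windows (window_skip smaller than window_length)
w_half <- window_ranges(n = 1356, window_length = 100, window_skip = 50)

head(w_full, 3)
head(w_half, 3)
```

With window_skip equal to window_length every observation is examined once; a smaller window_skip makes consecutive windows overlap, which means more windows are evaluated over the same stream.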

Value

A list with components

out_marix

The indices of the outlying series in each window

p_value

The p-value of the two-sample comparison test used for concept drift detection

anom_threshold

anomalous threshold

For each window, a plot is also produced on the current graphics device.
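
One way to read p_value together with cd_alpha is as a simple decision rule: treat a window as drifted when the two-sample test rejects at level cd_alpha. The sketch below illustrates that rule only. It assumes the rule "update when p_value < cd_alpha", which this help page does not state explicitly, and it substitutes stats::ks.test on a one-dimensional sample for the package's own two-sample comparison test. drift_detected is a hypothetical helper, not part of oddstream.

```r
# Hypothetical helper (not part of oddstream): flag concept drift between a
# reference sample and the current window.
# Assumption: stats::ks.test stands in for the package's own two-sample test.
drift_detected <- function(reference, current, cd_alpha = 0.05) {
  p_value <- stats::ks.test(reference, current)$p.value
  list(p_value = p_value, update = p_value < cd_alpha)
}

set.seed(1)
reference <- rnorm(200, mean = 10, sd = 3)  # typical behaviour
shifted   <- rnorm(200, mean = 16, sd = 3)  # same spread, shifted level

res <- drift_detected(reference, shifted)
res$update  # whether the threshold would be re-estimated under this rule
```

A smaller cd_alpha makes the rule more conservative, so the threshold would be re-estimated less often.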

References

Clifton, D. A., Hugueny, S., & Tarassenko, L. (2011). Novelty detection with multivariate extreme value statistics. Journal of Signal Processing Systems, 65(3), 371-389.

Duong, T., Goud, B., & Schauer, K. (2012). Closed-form density-based framework for automatic detection of cellular morphology changes. PNAS, 109, 8382-8387.

Talagala, P., Hyndman, R., Smith-Miles, K., Kandanaarachchi, S., & Munoz, M. (2018). Anomaly detection in streaming nonstationary temporal data (No. 4/18). Monash University, Department of Econometrics and Business Statistics.

See Also

extract_tsfeatures, get_pc_space, set_outlier_threshold, gg_featurespace

Examples


# Generate a training data set: 100 series of 250 observations each,
# drawn from N(10, 3^2) to represent typical behaviour
set.seed(890)
nobs <- 250
nts <- 100
train_data <- ts(apply(matrix(ncol = nts, nrow = nobs), 2,
                       function(x) 10 + rnorm(length(x), 0, 3)))
# Generate a test stream with some outlying series
nobs <- 15000
test_stream <- ts(apply(matrix(ncol = nts, nrow = nobs), 2,
                        function(x) 10 + rnorm(length(x), 0, 3)))
# Inflate series 20 to 25 over two time intervals to create outliers
test_stream[360:1060, 20:25] <- test_stream[360:1060, 20:25] * 1.75
test_stream[2550:3550, 20:25] <- test_stream[2550:3550, 20:25] * 2
find_odd_streams(train_data, test_stream, trials = 100)


# Consider the first window of the data set as the training set and the
# remainder as the test stream

train_data <- anomalous_stream[1:100, ]
test_stream <- anomalous_stream[101:1456, ]
find_odd_streams(train_data, test_stream, trials = 100)



[Package oddstream version 0.5.0 Index]