find_odd_streams {oddstream} | R Documentation |
Detect outlying series within a collection of streaming time series
Description
This function detects outlying series within a collection of streaming time series. A sliding window is used to handle streaming data. In the presence of concept drift, the forecast boundary for the system's typical behaviour can be updated periodically.
Usage
find_odd_streams(train_data, test_stream, update_threshold = TRUE,
window_length = nrow(train_data), window_skip = window_length,
concept_drift = FALSE, trials = 500, p_rate = 0.001,
cd_alpha = 0.05)
Arguments
train_data |
A multivariate time series data set that represents the typical behaviour of the system. |
test_stream |
A multivariate streaming time series data set to be tested for outliers |
update_threshold |
If TRUE, the threshold value to determine outlying series is updated. The default value is set to TRUE |
window_length |
Sliding window size (Ideally this window length should be equal to the length of the training multivariate time series data set that is used to define the outlying threshold) |
window_skip |
The number of steps the window should slide forward. The default is set to window_length |
concept_drift |
If TRUE, the outlying threshold will be updated after each window. The default is set to FALSE |
trials |
Input for |
p_rate |
False positive rate. Default value is set to 0.001. |
cd_alpha |
Significance level for the test of non-stationarity. |
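The interaction between window_length and window_skip can be sketched in base R. The helper below, window_ranges, is hypothetical and not part of oddstream; it assumes that windows start at row 1 and that only complete windows are processed, which this page does not confirm.

```r
# Hypothetical helper (not part of oddstream): list the row ranges that a
# sliding window would cover. Assumes windows start at row 1 and that only
# complete windows are kept.
window_ranges <- function(n_rows, window_length, window_skip = window_length) {
  stopifnot(n_rows >= window_length, window_length >= 1, window_skip >= 1)
  starts <- seq(1, n_rows - window_length + 1, by = window_skip)
  data.frame(start = starts, end = starts + window_length - 1)
}

# Non-overlapping windows: window_skip defaults to window_length
non_overlapping <- window_ranges(n_rows = 1000, window_length = 250)
print(non_overlapping)

# Overlapping windows: slide forward 100 rows at a time
overlapping <- window_ranges(n_rows = 1000, window_length = 250, window_skip = 100)
print(overlapping)
```

With the default window_skip the windows tile the stream without overlap; a smaller window_skip makes consecutive windows share rows, so each observation is examined more than once.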
Value
a list with components
out_marix |
The indices of the outlying series in each window |
p_value |
p-value for the two sample comparison test for concept drift detection |
anom_threshold |
anomalous threshold |
For each window, a plot is also produced on the current graphics device.
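A minimal sketch of inspecting the returned list, assuming the oddstream package is installed. The component names are taken from the Value list above; the data sizes and the object name res are illustrative only, and the exact type of each component is not documented here, so the sketch only prints the structure.

```r
# Illustrative synthetic data: 100 series, 250 training rows, 500 test rows
set.seed(890)
train_data <- ts(matrix(10 + rnorm(250 * 100, 0, 3), nrow = 250))
test_stream <- ts(matrix(10 + rnorm(500 * 100, 0, 3), nrow = 500))
# Inflate a few series so that there is something to flag
test_stream[100:200, 20:25] <- test_stream[100:200, 20:25] * 2

# Run only if oddstream is available (assumption: it is installed)
if (requireNamespace("oddstream", quietly = TRUE)) {
  res <- oddstream::find_odd_streams(train_data, test_stream, trials = 100)
  # Show the top-level components without assuming their types
  str(res, max.level = 1)
  # Note the spelling of the first component name, as listed above
  print(res$out_marix)
}
```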
References
Clifton, D. A., Hugueny, S., & Tarassenko, L. (2011). Novelty detection with multivariate extreme value statistics. Journal of Signal Processing Systems, 65(3), 371-389.
Duong, T., Goud, B. & Schauer, K. (2012) Closed-form density-based framework for automatic detection of cellular morphology changes. PNAS, 109, 8382-8387.
Talagala, P., Hyndman, R., Smith-Miles, K., Kandanaarachchi, S., & Munoz, M. (2018). Anomaly detection in streaming nonstationary temporal data (No. 4/18). Monash University, Department of Econometrics and Business Statistics.
See Also
extract_tsfeatures, get_pc_space, set_outlier_threshold, gg_featurespace
Examples
#Generate training dataset
set.seed(890)
nobs = 250
nts = 100
train_data <- ts(apply(matrix(ncol = nts, nrow = nobs), 2, function(nobs){10 + rnorm(nobs, 0, 3)}))
# Generate test stream with some outlying series
nobs = 15000
test_stream <- ts(apply(matrix(ncol = nts, nrow = nobs), 2, function(nobs){10 + rnorm(nobs, 0, 3)}))
test_stream[360:1060, 20:25] = test_stream[360:1060, 20:25] * 1.75
test_stream[2550:3550, 20:25] = test_stream[2550:3550, 20:25] * 2
find_odd_streams(train_data, test_stream , trials = 100)
# Use the first window of the anomalous_stream data set as the training set
# and the remaining rows as the test stream
train_data <- anomalous_stream[1:100, ]
test_stream <- anomalous_stream[101:1456, ]
find_odd_streams(train_data, test_stream, trials = 100)