scapa.uv {anomaly}    R Documentation

Detection of univariate anomalous segments using SCAPA.

Description

An offline, as-if-online implementation of SCAPA (Sequential Collective And Point Anomalies) by Bardwell et al. (2019) for online collective and point anomaly detection. It differs from capa.uv in that transform defaults to tierney, which uses sequential estimates to transform the data prior to analysis. It also returns an S4 object which allows the results to be post-processed at different time points, as if the data had been analysed in an online fashion up to that point.

Usage

scapa.uv(
  x,
  beta = NULL,
  beta_tilde = NULL,
  type = "meanvar",
  min_seg_len = 10,
  max_seg_len = Inf,
  transform = tierney
)

Arguments

x

A numeric vector containing the data which is to be inspected.

beta

A numeric vector of length 1 or max_seg_len - min_seg_len + 1 giving the penalty for adding a collective anomaly of each possible length. If an argument of length 1 is provided, the same penalty is used for all collective anomalies irrespective of their length. The default value is 4 log(n), where n denotes the number of observations.

beta_tilde

A numeric constant indicating the penalty for adding an additional point anomaly. It defaults to 3 log(n), where n is the number of observations.
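As a rough illustration of the two penalty arguments, the sketch below computes the documented default values by hand and builds a length-dependent beta vector. It uses base R only. The data and the values chosen for min_seg_len and max_seg_len are assumptions made for the example, and a vector-valued beta presupposes a finite max_seg_len.

```r
# Stand-in data; any numeric vector would do.
set.seed(1)
x <- rnorm(5000)
n <- length(x)

# Default penalties as described above: 4 log(n) and 3 log(n).
beta_default <- 4 * log(n)
beta_tilde_default <- 3 * log(n)

# A length-dependent penalty needs max_seg_len - min_seg_len + 1 entries,
# so max_seg_len must be finite. The values 10 and 100 are illustrative.
min_seg_len <- 10
max_seg_len <- 100
beta_by_length <- rep(beta_default, max_seg_len - min_seg_len + 1)
length(beta_by_length)  # 91
```

A constant vector like this one should behave like passing the single value beta_default; a non-constant vector would penalise some segment lengths more than others.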

type

A string indicating which type of deviations from the baseline are considered. Can be "meanvar" for collective anomalies characterised by joint changes in mean and variance (the default), "mean" for collective anomalies characterised by changes in mean only, or "robustmean" for collective anomalies characterised by changes in mean only which can be polluted by outliers.

min_seg_len

An integer indicating the minimum length of epidemic changes. It must be at least 2 and defaults to 10.

max_seg_len

An integer indicating the maximum length of epidemic changes. It must be at least min_seg_len and defaults to Inf.

transform

A function used to transform the data prior to analysis by scapa.uv. This can, for example, be used to compensate for the effects of autocorrelation in the data. Importantly, the untransformed data remains available for post-processing results obtained using scapa.uv. The package includes a function which can be used for the transform (see tierney, the default), but a user-defined (ideally sequential) function can be specified instead.
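A user-defined sequential transform could look like the sketch below. The function seq_standardise and its burnin argument are hypothetical names introduced for illustration. It is not the method used by tierney; it standardises each observation with the running mean and standard deviation of the data seen so far (Welford's update). It is assumed here that a transform receives the data as a numeric vector and returns a vector of the same length.

```r
# Hypothetical sequential transform (not part of the anomaly package).
# Each value is standardised using only observations up to and including it.
seq_standardise <- function(x, burnin = 10) {
  n <- length(x)
  z <- numeric(n)
  m <- 0  # running mean
  s <- 0  # running sum of squared deviations from the mean
  for (i in seq_len(n)) {
    delta <- x[i] - m
    m <- m + delta / i
    s <- s + delta * (x[i] - m)
    sd_i <- if (i > 1) sqrt(s / (i - 1)) else 0
    # During the burn-in the estimates are unreliable, so return 0.
    z[i] <- if (i > burnin && sd_i > 0) (x[i] - m) / sd_i else 0
  }
  z
}

z <- seq_standardise(c(1, 2, 3, 4), burnin = 2)
```

Under the assumption above, such a function would be supplied as scapa.uv(x, transform = seq_standardise). Because every output value depends only on past and present observations, the transform does not use information that would be unavailable in an online setting.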

Value

An instance of the S4 class scapa.uv.class.
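The returned object can be queried as if the analysis had stopped at an earlier time point. The sketch below assumes that the package's collective_anomalies and point_anomalies accessors accept an epoch argument for this class, in the same way as plot does; the simulated data and the chosen epochs are illustrative only.

```r
library(anomaly)

# Simulated series with one collective anomaly at positions 201 to 250.
set.seed(1)
x <- rnorm(500)
x[201:250] <- rnorm(50, 10, 1)

res <- scapa.uv(x)

# Anomalies inferred from the first 230 observations only
# (assumes the accessors take an epoch argument).
ca_partial <- collective_anomalies(res, epoch = 230)
pa_partial <- point_anomalies(res, epoch = 230)

# Anomalies inferred from the full series.
ca_full <- collective_anomalies(res)
```

Comparing ca_partial with ca_full shows how the inferred anomalies are revised as more data arrives.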

References

Fisch ATM, Eckley IA, Fearnhead P (2018). “A linear time method for the detection of point and collective anomalies.” ArXiv e-prints. https://arxiv.org/abs/1806.01947.

Fisch ATM, Bardwell L, Eckley IA (2020). “Real Time Anomaly Detection And Categorisation.”

Examples

library(anomaly)

# Simulated data example
# Generate data typically following a normal distribution with mean 0 and variance 1.
# Then introduce 3 anomaly windows and 4 point outliers.

set.seed(2018)
x <- rnorm(5000)
x[1601:1700] <- rnorm(100, 0, 0.01)
x[3201:3300] <- rnorm(100, 0, 10)
x[4501:4550] <- rnorm(50, 10, 1)
x[c(1000, 2000, 3000, 4000)] <- rnorm(4, 0, 100)
# use magrittr to pipe the data to the transform
library(magrittr)
trans <- . %>% tierney(1000)
res <- scapa.uv(x, transform = trans)

# Plot results at two different times and note that anomalies are re-evaluated:
plot(res, epoch = 3201)
plot(res, epoch = 3205)



[Package anomaly version 4.0.1 Index]