capa {anomaly} R Documentation

A technique for detecting anomalous segments and points based on CAPA.

Description

A technique for detecting anomalous segments and points based on CAPA (Collective And Point Anomalies) by Fisch et al. (2018). This is a generic method that can be used for both univariate and multivariate data. The specific method that is used for the analysis is deduced by capa from the dimensions of the data.

Usage

capa(
  x,
  beta = NULL,
  beta_tilde = NULL,
  type = "meanvar",
  min_seg_len = 10,
  max_seg_len = Inf,
  max_lag = 0,
  transform = robustscale
)

Arguments

x

A numeric matrix with n rows and p columns containing the data which is to be inspected.
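As a minimal sketch of the expected input shape, the base R snippet below builds an n-by-p numeric matrix and wraps a univariate series as a one-column matrix. The variable names are illustrative, and whether capa also accepts a bare vector is not stated here, so the explicit conversion is a conservative assumption.

```r
set.seed(1)

# Multivariate case: n = 200 observations (rows) of p = 3 variables (columns).
x_multi <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)

# Univariate case: wrap the series as a one-column matrix so that p = 1.
series <- rnorm(200)
x_uni <- matrix(series, ncol = 1)

dim(x_multi)  # n and p for the multivariate data
dim(x_uni)    # n and p for the univariate data
```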

beta

A numeric vector of length p, giving the marginal penalties. If p > 1, type = "meanvar" or type = "mean", and max_lag > 0, it defaults to the penalty regime 2' described in Fisch, Eckley and Fearnhead (2019). If p > 1, type = "mean" or type = "meanvar", and max_lag = 0, it defaults to the pointwise minimum of the penalty regimes 1, 2, and 3 in Fisch, Eckley and Fearnhead (2019).

beta_tilde

A numeric constant indicating the penalty for adding an additional point anomaly. It defaults to 3log(np), where n and p are the data dimensions.
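To make the default concrete, the sketch below evaluates 3log(np) in base R for illustrative dimensions. It assumes log denotes the natural logarithm, which the text does not state explicitly, and the variable names are hypothetical.

```r
# Data dimensions (illustrative values).
n <- 500
p <- 100

# Default point-anomaly penalty as described above: 3 * log(n * p),
# assuming the natural logarithm.
beta_tilde_default <- 3 * log(n * p)

# A hypothetical stricter choice: a larger penalty makes each additional
# point anomaly more costly to add.
beta_tilde_strict <- 2 * beta_tilde_default

beta_tilde_default
beta_tilde_strict
```

A value chosen this way could then be supplied through the beta_tilde argument in place of the default.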

type

A string indicating which type of deviations from the baseline are considered. Can be "meanvar" for collective anomalies characterised by joint changes in mean and variance (the default), "mean" for collective anomalies characterised by changes in mean only, or "robustmean" for collective anomalies characterised by changes in mean only which can be polluted by outliers.

min_seg_len

An integer indicating the minimum length of epidemic changes. It must be at least 2 and defaults to 10.

max_seg_len

An integer indicating the maximum length of epidemic changes. It must be at least min_seg_len and defaults to Inf.

max_lag

A non-negative integer indicating the maximum start or end lag. Only useful for multivariate data. Default value is 0.

transform

A function used to centre the data prior to analysis by capa. This can, for example, be used to compensate for the effects of autocorrelation in the data. Importantly, the untransformed data remains available for post-processing the results obtained using capa. The package includes several methods that are commonly used for the transform (see robustscale and ac_corrected), but a user-defined function can also be specified. The default value is transform = robustscale.
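A user-defined transform might look like the following sketch, written in base R only. The function median_mad_scale is hypothetical and is not part of the package; the sketch assumes that a transform takes the data matrix and returns a matrix of the same dimensions, which should be checked against robustscale before relying on it.

```r
# Hypothetical user-defined transform: centre each column by its median and
# scale it by its median absolute deviation (MAD).
median_mad_scale <- function(x) {
  x <- as.matrix(x)
  centres <- apply(x, 2, median)  # per-column medians
  scales <- apply(x, 2, mad)      # per-column MADs
  sweep(sweep(x, 2, centres, "-"), 2, scales, "/")
}

set.seed(2)
x <- matrix(rnorm(300, mean = 5, sd = 2), ncol = 3)
z <- median_mad_scale(x)

dim(z)               # same dimensions as the input
apply(z, 2, median)  # column medians are numerically zero after centring
```

Under these assumptions, the function would be supplied as capa(x, transform = median_mad_scale).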

Value

An instance of an S4 class of type capa.class.

References

Fisch ATM, Eckley IA, Fearnhead P (2018). “A linear time method for the detection of point and collective anomalies.” ArXiv e-prints. https://arxiv.org/abs/1806.01947.

Examples

library(anomaly)
# generate some multivariate data
set.seed(0)
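sim.data <- simulate(n = 500, p = 100, mu = 2, locations = c(100, 200, 300),
                     duration = 6, proportions = c(0.04, 0.06, 0.08))
# detect collective anomalies in the mean, allowing lags of up to 5
res <- capa(sim.data, type = "mean", min_seg_len = 2, max_lag = 5)
# list the detected collective anomalies and plot the result
collective_anomalies(res)
plot(res)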


[Package anomaly version 4.0.1]