EnBinSeg {eNchange}    R Documentation

An S4 method to detect the change-points in an irregularly spaced time series using Ensemble Binary Segmentation.

Description

An S4 method to detect the change-points in an irregularly spaced time series using the Ensemble Binary Segmentation methodology described in Korkas (2020).

Usage

EnBinSeg(
  H,
  thresh = "universal",
  q = 0.99,
  p = 1,
  start.values = c(0.9, 0.6),
  dampen.factor = "auto",
  epsilon = 1e-05,
  LOG = TRUE,
  process = "acd",
  thresh2 = 0.05,
  num_ens = 500,
  min_dist = 0.005,
  pp = 1,
  do.parallel = 2,
  b = NULL,
  acd_p = 0,
  acd_q = 1
)

## S4 method for signature 'ANY'
EnBinSeg(
  H,
  thresh = "universal",
  q = 0.99,
  p = 1,
  start.values = c(0.9, 0.6),
  dampen.factor = "auto",
  epsilon = 1e-05,
  LOG = TRUE,
  process = "acd",
  thresh2 = 0.05,
  num_ens = 500,
  min_dist = 0.005,
  pp = 1,
  do.parallel = 2,
  b = NULL,
  acd_p = 0,
  acd_q = 1
)

Arguments

H

The input irregular time series.

thresh

The threshold parameter, which acts as a stopping rule for detecting further change-points and has the form C log(sample). If "universal", C is data-independent and preselected using the approach described in Korkas (2020). If "boot", the data-dependent method boot_thresh is used. Default is "universal".
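As a rough illustration of how a threshold of this form can act as a stopping rule, the sketch below assumes that "sample" refers to the sample size n and that a segment is split only when its largest CUSUM value exceeds the threshold. The constant C and the CUSUM value are hypothetical placeholders, not values produced by the package.

```r
# Minimal sketch of a C * log(n) stopping rule (base R only).
# C and cusum_max are hypothetical; the package chooses C by
# simulation ("universal") or by bootstrap ("boot").
C <- 1.5
n <- 5000
threshold <- C * log(n)          # about 12.78 for these values

cusum_max <- 14.2                # illustrative maximum CUSUM on a segment
split_segment <- cusum_max > threshold
split_segment
```

With a larger C, fewer segments pass the test, so fewer change-points are reported.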

q

The universal threshold simulation quantile or the bootstrap distribution quantile. Default is 0.99.

p

The support of the CUSUM statistic. Default is 1.

start.values

Warm starts for the optimizers of the likelihood functions. Default is c(0.9, 0.6).

dampen.factor

The dampen factor in the denominator of the residual process. Default is "auto".

epsilon

A parameter added to ensure the boundedness of the residual process. Default is 1e-5.

LOG

Whether to take the log of the residual process. Default is TRUE.

process

Choose between "acd", "hawkes", or "additive" (signal + i.i.d. noise). Default is "acd".

thresh2

Keep only the change-points that appear more than thresh2 * M times, where M is the number of ensembles (num_ens). Default is 0.05.
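The frequency filter can be pictured with a small base R sketch. The candidate locations and their counts below are made up for illustration, and the pooling via table() is an assumption about how such a filter might be written, not the package's internal code.

```r
# Minimal sketch: keep candidates detected in more than thresh2 * M runs.
M <- 500
thresh2 <- 0.05

# Hypothetical pooled detections from M ensemble runs.
candidates <- c(rep(1000, 320), rep(2500, 410), rep(3100, 12), rep(4200, 20))

freq <- table(candidates)                       # how often each location appears
kept <- as.integer(names(freq)[freq > thresh2 * M])
kept
```

Here thresh2 * M equals 25, so the locations seen 12 and 20 times are discarded.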

num_ens

Number of ensembles denoted by M in the paper. Default is 500.

min_dist

The minimum distance, as a proportion of the sample size, to use in the post-processing. Default is 0.005.

pp

Post-process the change-points based on the distance from the highest ranked change-points. Default is 1.
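One way to read the distance-based post-processing is sketched below in base R. It assumes that min_dist is a fraction of the sample size, that candidates are ranked by their ensemble frequency, and that a lower ranked candidate is dropped when it lies closer than the minimum gap to one already kept. The candidate locations and frequencies are hypothetical, and the package may round or rank differently.

```r
# Minimal sketch of distance-based post-processing.
N <- 5000
min_dist <- 0.005
min_gap <- round(min_dist * N)   # 25 observations for these values

# Hypothetical candidates and their ensemble frequencies.
cps  <- c(1000, 1010, 2500, 2520, 4000)
freq <- c(300, 120, 410, 90, 60)

kept <- numeric(0)
for (i in order(freq, decreasing = TRUE)) {
  # Keep a candidate only if it is far enough from every one already kept.
  if (all(abs(cps[i] - kept) >= min_gap)) kept <- c(kept, cps[i])
}
kept <- sort(kept)
kept
```

In this toy input the candidates at 1010 and 2520 are removed because each sits within 25 observations of a higher ranked neighbour.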

do.parallel

The number of cores to use for parallel computation. If 0, no parallelization is used. Default is 2.

b

A parameter to control how close the random end points are to the start points. A large value will on average return shorter random intervals. If NULL all points have an equal chance to be selected (uniformly distributed). Default is NULL.
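For the default case (b = NULL), drawing random intervals could look like the base R sketch below. It only covers the uniform case; the distribution used when b is supplied is not described here, so it is not reproduced. The variable names and the seed are illustrative assumptions.

```r
# Minimal sketch: M random intervals [s, e] with uniformly chosen end points.
set.seed(42)
n <- 5000   # sample size
M <- 500    # number of ensembles

s <- sample.int(n - 1, M, replace = TRUE)                  # start in 1..(n - 1)
e <- s + vapply(n - s, sample.int, integer(1), size = 1)   # end in (s + 1)..n

summary(e - s)   # distribution of interval lengths
```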

acd_p

The p order of the ACD model. Default is 0.

acd_q

The q order of the ACD model. Default is 1.

Value

Returns a list with the detected change-points and the frequency table of the ensembles across M applications.

References

Korkas, Karolos (2020). "Ensemble Binary Segmentation for irregularly spaced data with change-points". Preprint <arXiv:2003.03649>.

Examples

library(eNchange)

# Simulate a piecewise ACD process with change-points between 10% and 95% of the sample
pw.acd.obj <- new("simACD")
pw.acd.obj@cp.loc <- seq(0.1, 0.95, by = 0.025)
pw.acd.obj@lambda_0 <- rep(c(0.5, 2), 1 + length(pw.acd.obj@cp.loc) / 2)
pw.acd.obj@alpha <- rep(0.2, 1 + length(pw.acd.obj@cp.loc))
pw.acd.obj@beta <- rep(0.4, 1 + length(pw.acd.obj@cp.loc))
pw.acd.obj@N <- 5000
pw.acd.obj <- pc_acdsim(pw.acd.obj)

# Ensemble Binary Segmentation: detected change-points in red
ts.plot(pw.acd.obj@x, main = "Ensemble BS")
abline(v = EnBinSeg(pw.acd.obj@x)[[1]], col = "red")
# real change-points in grey
abline(v = floor(pw.acd.obj@cp.loc * pw.acd.obj@N), col = "grey", lty = 2)

# Standard Binary Segmentation: detected change-points in blue
ts.plot(pw.acd.obj@x, main = "Standard BS")
abline(v = BinSeg(pw.acd.obj@x)[[1]], col = "blue")
# real change-points in grey
abline(v = floor(pw.acd.obj@cp.loc * pw.acd.obj@N), col = "grey", lty = 2)

[Package eNchange version 1.0 Index]