pcm_ic {IDetect}    R Documentation

Multiple change-point detection in the mean via minimising an information criterion

Description

This function applies the Isolate-Detect methodology with an information criterion approach to detect multiple change-points in the mean of a noisy data sequence, where the noise is assumed to follow the Gaussian distribution. The Details section explains how the approach works and gives the relevant literature reference.

Usage

pcm_ic(x, th_const = 0.9, Kmax = 200, penalty = c("ssic_pen", "sic_pen"),
  points = 10)

Arguments

x

A numeric vector containing the data in which you would like to find change-points.

th_const

A positive real number with default value equal to 0.9. It defines the threshold value used in the first step of the model-selection-based Isolate-Detect method; see Details for more information.

Kmax

A positive integer with default value equal to 200. It is the maximum allowed number of estimated change-points in the solution path algorithm, described in Details below.

penalty

A character vector with the names of the penalty functions to use. The default is c("ssic_pen", "sic_pen").

points

A positive integer with default value equal to 10. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.
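As a rough illustration of what points controls, the sketch below lists the end-points of right-expanding intervals and the start-points of left-expanding intervals for a short sequence. It assumes that right-expanding intervals start at position 1 and left-expanding intervals end at position n; the variable names and the value n = 50 are illustrative only, and the internal layout used by the package may differ.

```r
# Sketch only: how `points` could space the interval end- and start-points.
n <- 50        # length of the data sequence (illustrative value)
points <- 10   # same meaning as the `points` argument of pcm_ic

# End-points of right-expanding intervals [1, 10], [1, 20], ...
right_ends <- seq(points, n, by = points)

# Start-points of left-expanding intervals [41, 50], [31, 50], ...
left_starts <- seq(n - points + 1, 1, by = -points)

print(right_ends)
print(left_starts)
```

A smaller value of points gives a finer grid of intervals, at the cost of examining more intervals.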

Details

pcm_ic detects change-points by identifying the set of change-points that minimises an information criterion. First, sol_path_pcm is employed; it overestimates the number of change-points, using th_const to define the threshold, and then sorts the obtained estimates so that the estimate most likely to be correct appears first and the one least likely to be correct appears last. Let J be the number of estimates that this overestimation step returns. This gives a vector b = (b_1, b_2, ..., b_J), with the estimates ordered as explained above. We define the collection {M_j}, j = 0, 1, ..., J, where M_0 is the empty set and M_j = {b_1, b_2, ..., b_j}. Among the models M_j, j = 0, 1, ..., J, we select the one that minimises a predefined information criterion. By construction, the obtained set of change-points is a subset of the solution path given by sol_path_pcm. More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
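The selection step over the nested models M_0, M_1, ..., M_J can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper ic_value is hypothetical, the criterion (n / 2) * log(RSS / n) + j * log(n) is an assumed SIC-type form, and the ordered path b is made up by hand rather than produced by sol_path_pcm.

```r
# Hypothetical helper: SIC-type criterion for a piecewise-constant mean fit.
# Assumed form: (n / 2) * log(RSS / n) + (number of change-points) * log(n).
ic_value <- function(x, cpts) {
  n <- length(x)
  # A change-point at b means a new segment starts at position b + 1.
  seg <- cumsum(seq_len(n) %in% (cpts + 1))
  fitted <- ave(x, seg)                 # segment-wise sample means
  rss <- sum((x - fitted)^2)
  (n / 2) * log(rss / n) + length(cpts) * log(n)
}

set.seed(1)
x <- c(rep(4, 100), rep(0, 100)) + rnorm(200)

# Made-up ordered solution path: most plausible estimate first.
b <- c(100, 37, 160)
J <- length(b)

# Criterion value for each nested model M_j = {b_1, ..., b_j}, j = 0, ..., J.
ic <- sapply(0:J, function(j) ic_value(x, b[seq_len(j)]))

j_hat <- which.min(ic) - 1              # index of the selected model
cpt_hat <- sort(b[seq_len(j_hat)])      # selected change-points

print(ic)
print(cpt_hat)
```

Because each M_j consists of the first j entries of b, the selected set is always a subset of the solution path, as stated above.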

Value

A list with the following components:

sol_path A vector containing the solution path.
ic_curve A list with values of the chosen information criteria.
cpt_ic A list with the change-points detected for each information criterion considered.
no_cpt_ic The number of change-points detected for each information criterion considered.
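A short sketch of how the returned components might be inspected is given below. The helper summarise_pcm_ic is hypothetical and relies only on the component names listed above; the call to pcm_ic is guarded so that the block still runs when IDetect is not installed.

```r
# Hypothetical helper: compact summary of a pcm_ic-style result list.
summarise_pcm_ic <- function(fit) {
  needed <- c("sol_path", "ic_curve", "cpt_ic", "no_cpt_ic")
  stopifnot(is.list(fit), all(needed %in% names(fit)))
  list(
    path_length = length(fit$sol_path),  # number of entries in the solution path
    n_cpts = fit$no_cpt_ic,              # detected count per criterion
    cpts = fit$cpt_ic                    # detected locations per criterion
  )
}

if (requireNamespace("IDetect", quietly = TRUE)) {
  set.seed(1)
  x <- c(rep(4, 1000), rep(0, 1000)) + rnorm(2000)
  fit <- IDetect::pcm_ic(x)
  str(summarise_pcm_ic(fit))
}
```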

Author(s)

Andreas Anastasiou, a.anastasiou@lse.ac.uk

See Also

ID_pcm and ID, which employ this function. In addition, see cplm_ic for the case of detecting changes in a continuous, piecewise-linear signal using the information criterion based approach.

Examples

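# Fix the random seed so that the simulated noise is reproducible
set.seed(1)

# One change-point in the mean, at position 1000
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.ic <- pcm_ic(single.cpt.noise)

# Three change-points in the mean, at positions 500, 1000 and 1500
three.cpt <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.ic <- pcm_ic(three.cpt.noise)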

[Package IDetect version 0.1.0 Index]