pcm_ic {IDetect} | R Documentation |
Multiple change-point detection in the mean via minimising an information criterion
Description
This function applies the Isolate-Detect methodology, combined with an information criterion, to detect multiple change-points in the mean of a noisy data sequence whose noise follows the Gaussian distribution. Details explains how the approach works and gives the relevant literature reference.
Usage
pcm_ic(x, th_const = 0.9, Kmax = 200, penalty = c("ssic_pen", "sic_pen"),
points = 10)
Arguments
x | A numeric vector containing the data in which you would like to find change-points. |
th_const | A positive real number with default value equal to 0.9. It is used to define the threshold value that will be used at the first step of the model selection based Isolate-Detect method; see Details for more information. |
Kmax | A positive integer with default value equal to 200. It is the maximum allowed number of estimated change-points in the solution path algorithm, described in Details below. |
penalty | A character vector with the names of the penalty functions to use. The default, c("ssic_pen", "sic_pen"), uses both. |
points | A positive integer with default value equal to 10. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively. |
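As a rough illustration of what points controls, the sketch below lists the end-points of right-expanding intervals and the start-points of left-expanding intervals on a grid with spacing points. The helper expanding_endpoints is hypothetical and is not part of IDetect; the exact grid used inside the package, including how the last, shorter step is handled, is an assumption here.

```r
# Hypothetical helper, not an IDetect function: grid of interval end- and
# start-points with spacing `points` on a sequence of length n.
# Assumes n >= points.
expanding_endpoints <- function(n, points = 10) {
  # Right-expanding intervals [1, e]: e grows in steps of `points` up to n.
  right_ends <- unique(c(seq(points, n, by = points), n))
  # Left-expanding intervals [s, n]: s shrinks in steps of `points` down to 1.
  left_starts <- unique(c(seq(n - points + 1, 1, by = -points), 1))
  list(right_ends = right_ends, left_starts = left_starts)
}

grid <- expanding_endpoints(n = 35, points = 10)
grid$right_ends
grid$left_starts
```

A smaller value of points gives a finer grid, so more intervals are examined; a larger value gives a coarser grid and fewer intervals.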
Details
The approach followed in pcm_ic to detect the change-points is based on identifying the set of change-points that minimises an information criterion. First, we employ sol_path_pcm, which overestimates the number of change-points, using th_const to define the threshold, and then sorts the obtained estimates so that the estimate most likely to be correct appears first and the one least likely to be correct appears last. Let J be the number of estimates that this overestimation step returns. We obtain a vector b = (b_1, b_2, \ldots, b_J), with the estimates ordered as explained above. We define the collection \{M_j\}_{j = 0, 1, \ldots, J}, where M_0 is the empty set and M_j = \{b_1, b_2, \ldots, b_j\}. Among the models M_j, j = 0, 1, \ldots, J, we select the one that minimises a predefined information criterion. By construction, the obtained set of change-points is a subset of the solution path given by sol_path_pcm. More details can be found in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint.
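The selection step over the nested models M_0, M_1, ..., M_J can be sketched in base R as follows. This is a minimal illustration and not the IDetect implementation: the helpers rss_pcm and select_by_ic are hypothetical, the candidate vector b is made up rather than produced by sol_path_pcm, and the criterion used, (n / 2) * log(RSS / n) + (j + 1) * log(n)^alpha, is a generic SIC-type form assumed here, since this page does not state the exact penalties behind "ssic_pen" and "sic_pen".

```r
# Illustrative sketch only; not the code used by pcm_ic.

# Residual sum of squares of a piecewise-constant fit with change-points `cpts`.
# Assumes the change-points are distinct integers in 1, ..., length(x) - 1.
rss_pcm <- function(x, cpts) {
  n <- length(x)
  bounds <- c(0, sort(cpts), n)                    # segments are (b_k, b_{k+1}]
  seg <- cut(seq_len(n), breaks = bounds, labels = FALSE)
  sum((x - ave(x, seg))^2)                         # ave() returns segment means
}

# Evaluate an assumed SIC-type criterion on M_0, ..., M_J and keep the minimiser.
select_by_ic <- function(x, b, alpha = 1) {
  n <- length(x)
  J <- length(b)
  ic <- vapply(0:J, function(j) {
    rss <- rss_pcm(x, b[seq_len(j)])               # M_j = first j ordered estimates
    (n / 2) * log(rss / n) + (j + 1) * log(n)^alpha
  }, numeric(1))
  j_best <- which.min(ic) - 1                      # ic[1] corresponds to M_0
  list(ic_curve = ic, cpt = sort(b[seq_len(j_best)]))
}

set.seed(1)
x <- c(rep(4, 200), rep(0, 200)) + rnorm(400)
b <- c(200, 90, 310)   # hypothetical ordered candidates, most plausible first
res <- select_by_ic(x, b)
res$ic_curve
res$cpt
```

Because every M_j is a prefix of b, the selected change-points are always a subset of the candidate vector, which mirrors the subset property described above.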
Value
A list with the following components:
sol_path | A vector containing the solution path. |
ic_curve | A list with values of the chosen information criteria. |
cpt_ic | A list with the change-points detected for each information criterion considered. |
no_cpt_ic | The number of change-points detected for each information criterion considered. |
Author(s)
Andreas Anastasiou, a.anastasiou@lse.ac.uk
See Also
ID_pcm and ID, which employ this function.
In addition, see cplm_ic for the case of detecting changes in a continuous, piecewise-linear signal using the information criterion based approach.
Examples
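library(IDetect)
# One change-point at position 1000: the mean drops from 4 to 0.
single.cpt <- c(rep(4, 1000), rep(0, 1000))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.ic <- pcm_ic(single.cpt.noise)
# Three change-points, at positions 500, 1000 and 1500.
three.cpt <- c(rep(4, 500), rep(0, 500), rep(-4, 500), rep(1, 500))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.ic <- pcm_ic(three.cpt.noise)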