Pilliat {HDCD} R Documentation

Pilliat multiple change-point detection algorithm

Description

R wrapper for the C implementation of the multiple change-point detection algorithm by Pilliat et al. (2023), using seeded intervals generated by Algorithm 4 in Moen et al. (2023). For simplicity, the detection thresholds do not depend on the width of the interval in which a change-point is tested for (that is, r=1 is set for all intervals).

Usage

Pilliat(
  X,
  threshold_d_const = 4,
  threshold_bj_const = 6,
  threshold_partial_const = 4,
  K = 2,
  alpha = 1.5,
  empirical = FALSE,
  threshold_dense = NULL,
  thresholds_partial = NULL,
  thresholds_bj = NULL,
  N = 100,
  tol = 0.01,
  rescale_variance = TRUE,
  test_all = FALSE,
  debug = FALSE
)

Arguments

X

Matrix of observations, where each row contains a time series

threshold_d_const

Leading constant for the analytical detection threshold for the dense statistic

threshold_bj_const

Leading constant for p_0 when computing the detection threshold for the Berk-Jones statistic

threshold_partial_const

Leading constant for the analytical detection threshold for the partial sum statistic

K

Parameter for generating seeded intervals

alpha

Parameter for generating seeded intervals

empirical

If TRUE, detection thresholds are based on Monte Carlo simulation using Pilliat_calibrate

threshold_dense

Manually specified value of detection threshold for the dense statistic

thresholds_partial

Vector of manually specified detection thresholds for the partial sum statistic, for sparsities/partial sums t=1,2,4,\ldots,2^{\lfloor \log_2(p)\rfloor}

thresholds_bj

Vector of manually specified detection thresholds for the Berk-Jones statistic, order corresponding to x=1,2,\ldots,x_0

N

If empirical=TRUE, N is the number of Monte Carlo samples used

tol

If empirical=TRUE, tol is the tolerated probability of a false detection (false positive) used when calibrating the thresholds

rescale_variance

If TRUE, each row of the data is re-scaled by a MAD estimate (see the function rescale_variance)

test_all

If TRUE, the algorithm tests for a change-point in all candidate positions of each considered interval

debug

If TRUE, diagnostic prints are provided during execution
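
The re-scaling controlled by rescale_variance can be illustrated in base R. This is a minimal sketch that assumes each row is simply divided by its own median absolute deviation, computed with stats::mad. The exact estimator used by the package function rescale_variance is not described on this page, so the package may differ in detail (for example in how the MAD is computed). The names X_toy, scales_toy and X_rescaled are illustrative only.

```r
set.seed(1)
p <- 4
n <- 200

# Toy data: p series of length n, where row i has standard deviation sds[i]
sds <- c(0.5, 1, 2, 4)
X_toy <- matrix(rnorm(n * p), nrow = p, ncol = n) * sds

# One MAD-based noise level estimate per row (assumption: plain stats::mad)
scales_toy <- apply(X_toy, 1, mad)

# Dividing a p x n matrix by a length-p vector divides row i by scales_toy[i]
X_rescaled <- X_toy / scales_toy

# After re-scaling, every row has MAD equal to 1
apply(X_rescaled, 1, mad)
```

Because the MAD scales with the data, rows with very different noise levels become comparable after this step, which is what makes a common detection threshold across rows meaningful.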

Value

A list containing

changepoints

vector of estimated change-points

number_of_changepoints

estimated number of change-points

scales

vector of estimated noise levels, one for each series (row of X)

startpoints

start point of the seeded interval detecting the corresponding change-point in changepoints

endpoints

end point of the seeded interval detecting the corresponding change-point in changepoints
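
Since changepoints, startpoints and endpoints are described as corresponding element by element, they can be collected into one table. The sketch below uses a hand-written list, res_mock, that only mimics the documented structure of the return value. Its numbers are made up for illustration and are not output from Pilliat. The helper name summarise_changepoints is hypothetical as well.

```r
# Hypothetical helper: one row per detected change-point
summarise_changepoints <- function(res) {
  data.frame(
    changepoint = res$changepoints,
    start       = res$startpoints,
    end         = res$endpoints
  )
}

# Stand-in for the list returned by Pilliat(); values are illustrative only
res_mock <- list(
  changepoints           = c(25L, 60L),
  number_of_changepoints = 2L,
  scales                 = rep(1, 5),
  startpoints            = c(20L, 48L),
  endpoints              = c(31L, 72L)
)

tab <- summarise_changepoints(res_mock)
tab$width <- tab$end - tab$start  # width of each detecting interval
print(tab)
```

With a real result, the same call would be summarise_changepoints(res).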

References

Moen PAJ, Glad IK, Tveten M (2023). “Efficient sparsity adaptive changepoint estimation.” Arxiv preprint, 2306.04702, https://doi.org/10.48550/arXiv.2306.04702.

Pilliat E, Carpentier A, Verzelen N (2023). “Optimal multiple change-point detection for high-dimensional data.” Electronic Journal of Statistics, 17(1), 1240-1315.

Examples

library(HDCD)
n = 50
p = 50
set.seed(100)
# Generating data
X = matrix(rnorm(n*p), ncol = n, nrow=p)
# Adding a single sparse change-point:
X[1:5, 26:n] = X[1:5, 26:n] +2

# Vanilla Pilliat:
res = Pilliat(X)
res$changepoints

# Manually setting leading constants for detection thresholds
res = Pilliat(X, threshold_d_const = 4, threshold_bj_const = 6, threshold_partial_const=4)
res$changepoints #estimated change-point locations

# Empirical choice of thresholds:
res = Pilliat(X, empirical = TRUE, N = 100, tol = 1/100)
res$changepoints

# Manual empirical choice of thresholds (equivalent to the above)
thresholds_emp = Pilliat_calibrate(n,p, N=100, tol=1/100)
thresholds_emp$thresholds_partial # thresholds for partial sum statistic
thresholds_emp$thresholds_bj # thresholds for Berk-Jones statistic
thresholds_emp$threshold_dense # threshold for dense statistic
res = Pilliat(X, threshold_dense = thresholds_emp$threshold_dense,
              thresholds_bj = thresholds_emp$thresholds_bj,
              thresholds_partial = thresholds_emp$thresholds_partial)
res$changepoints
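
The idea behind an empirical (Monte Carlo) threshold with parameters N and tol can be sketched in base R without the package. The code below is not the procedure used by Pilliat_calibrate, which is not described on this page. It assumes a simple univariate CUSUM statistic as a stand-in for the dense, partial sum and Berk-Jones statistics, and takes the empirical (1 - tol) quantile of its null distribution as the threshold. The names cusum_max and calibrate_toy are hypothetical.

```r
set.seed(100)

# Maximum standardised CUSUM statistic over all split points of a series
cusum_max <- function(x) {
  n <- length(x)
  s <- cumsum(x)
  t <- 1:(n - 1)
  # Under i.i.d. N(0,1) noise, s[t] - (t/n) * s[n] has variance t * (n - t) / n
  max(abs(s[t] - t / n * s[n]) / sqrt(t * (n - t) / n))
}

# Simulate N data sets without change-points and take the (1 - tol) quantile
calibrate_toy <- function(n, N = 100, tol = 0.01) {
  null_stats <- replicate(N, cusum_max(rnorm(n)))
  unname(quantile(null_stats, probs = 1 - tol, type = 1))
}

thr <- calibrate_toy(n = 50, N = 100, tol = 1/100)

# A series with a mean shift of size 2 at time 25, tested against the threshold
x_change <- c(rnorm(25), rnorm(25) + 2)
cusum_max(x_change) > thr
```

A larger N gives a more stable quantile estimate, and a smaller tol gives a more conservative (larger) threshold.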

[Package HDCD version 1.0 Index]