nsp_selfnorm {nsp}    R Documentation

Self-normalised Narrowest Significance Pursuit algorithm with general covariates and user-specified threshold

Description

This function runs the self-normalised Narrowest Significance Pursuit (NSP) algorithm on data sequence y and design matrix x to obtain localised regions (intervals) of the domain in which the parameters of the linear regression model y_t = beta(t) x_t + z_t significantly depart from constancy (e.g. by containing change-points). For any interval considered by the algorithm, a significant departure from parameter constancy is declared if the self-normalised multiscale deviation measure (see Details for the literature reference) exceeds lambda. This function is used by the higher-level function nsp_poly_selfnorm, which estimates a suitable lambda so that a given global significance level is guaranteed; users may prefer that function if x describes polynomial covariates. However, nsp_selfnorm can also be run directly if desired. The function assumes independence, symmetry and finite variance of the errors z_t, but little else; in particular, the errors do not need to have constant variance across t.
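
As a rough illustration of the model above, the following base-R sketch simulates a response with a piecewise-constant regression coefficient and independent, symmetric errors whose variance changes over t. The variable names (n, beta, x, z, y) and the chosen change-point location are illustrative assumptions, not part of the package.

```r
set.seed(1)
n <- 300

# Design matrix with the regressors as columns: here a single intercept column.
x <- matrix(1, nrow = n, ncol = 1)

# Piecewise-constant coefficient beta(t): one change-point after t = 150.
beta <- c(rep(0, 150), rep(5, 150))

# Independent, symmetric, finite-variance errors; the standard deviation
# grows linearly from 1 to 4, so the variance is not constant across t.
z <- stats::rnorm(n) * seq(from = 1, to = 4, length.out = n)

# Response following y_t = beta(t) x_t + z_t.
y <- beta * x[, 1] + z
```

A pair (y, x) built this way has the shape that nsp_selfnorm expects for its first two arguments.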

Usage

nsp_selfnorm(
  y,
  x,
  M,
  lambda,
  power = 1/2,
  min.size = 20,
  eps = 0.03,
  c = exp(1 + 2 * eps),
  overlap = FALSE
)

Arguments

y

A vector containing the data sequence being the response in the linear model y_t = beta(t) x_t + z_t.

x

The design matrix in the regression model above, with the regressors as columns.

M

The minimum number of intervals considered at each recursive stage, unless the number of all intervals is smaller, in which case all intervals are used.

lambda

The threshold parameter for measuring the significance of non-constancy (of the linear regression parameters), for use with the self-normalised multiscale supremum-type deviation measure described in the paper.

power

A parameter for the (rough) estimator of the global sum of squares of z_t; the span of the moving window in that estimator is min(n, max(round(n^power), min.size)), where n is the length of y.

min.size

The minimum span of the moving window used in the estimator described under power above (the span is still capped at n).

eps

Parameter of the self-normalisation statistic as described in the paper; use default if unsure how to set.

c

Parameter of the self-normalisation statistic as described in the paper; use default if unsure how to set.

overlap

If FALSE, then on discovering a significant interval, the search continues recursively to the left and to the right of that interval. If TRUE, then the search continues to the left and to the right of the midpoint of that interval.
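
The window span controlled by the power and min.size arguments can be checked by hand. The sketch below simply evaluates the formula quoted in the description of power; the helper name window_span is hypothetical and is not a function exported by the package.

```r
# Hypothetical helper (not exported by nsp): span of the moving window,
# as stated in the description of the power argument.
window_span <- function(n, power = 1/2, min.size = 20) {
  min(n, max(round(n^power), min.size))
}

window_span(300)    # round(300^0.5) = 17 is below min.size, so the span is 20
window_span(10000)  # round(10000^0.5) = 100 exceeds min.size, so the span is 100
window_span(10)     # the span is capped at n = 10
```

With the defaults, min.size therefore determines the span for any n below roughly 400, and power takes over for longer sequences.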

Details

The NSP algorithm is described in P. Fryzlewicz (2021) "Narrowest Significance Pursuit: inference for multiple change-points in linear models", preprint.

Value

A list with the following components:

intervals

A data frame containing the estimated intervals of significance: starts and ends are where the intervals start and end, respectively; values are the values of the deviation measure on each given interval; midpoints are their midpoints.

threshold.used

The threshold lambda.
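
To show how the returned list might be inspected, the sketch below works on a hand-made stand-in object that uses the component and column names documented above. The numbers in it are invented purely for illustration and are not output from nsp_selfnorm; the helper name interval_lengths is likewise hypothetical, and it assumes that both endpoints of an interval are inclusive.

```r
# Stand-in for a returned list; all values are made up for illustration only.
res <- list(
  intervals = data.frame(
    starts = c(90, 195),
    ends = c(110, 205),
    values = c(3.1, 2.8),
    midpoints = c(100, 200)
  ),
  threshold.used = 2.5
)

# Hypothetical helper: number of time points covered by each interval,
# assuming both endpoints are inclusive.
interval_lengths <- function(intervals) {
  intervals$ends - intervals$starts + 1
}

interval_lengths(res$intervals)

# An interval is reported only if its deviation measure exceeds lambda,
# so this check is expected to hold for a genuine result as well.
all(res$intervals$values > res$threshold.used)
```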

Author(s)

Piotr Fryzlewicz, p.fryzlewicz@lse.ac.uk

See Also

nsp_poly, nsp_poly_ar, nsp_tvreg, nsp_poly_selfnorm

Examples

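library(nsp)
set.seed(1)
# Piecewise-constant signal whose mean shifts after t = 100 and t = 200
g <- c(rep(0, 100), rep(10, 100), rep(0, 100))
# Add independent Gaussian noise whose standard deviation grows from 1 to 4
x.g <- g + stats::rnorm(300) * seq(from = 1, to = 4, length.out = 300)
# Simulate a reference distribution (third argument matches the default eps = 0.03)
wn003 <- sim_max_holder(100, 500, .03)
# Use its 90% quantile as the threshold lambda
lambda <- as.numeric(stats::quantile(wn003, .9))
# Intercept-only design matrix, with M = 100
nsp_selfnorm(x.g, matrix(1, 300, 1), 100, lambda)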

[Package nsp version 1.0.0 Index]