
changepoint {dfphase1}

R Documentation

Detection of a sustained change-point in univariate and multivariate data

Description

changepoint (univariate data) and mchangepoint (multivariate data) test for the presence of a sustained location and/or dispersion shift. Both functions can be applied to individual and subgrouped observations.

changepoint.normal.limits and mchangepoint.normal.limits precompute the corresponding control limits when the in-control distribution is normal.

Usage

changepoint(x, subset, score = c("Identity", "Ranks"), only.mean = FALSE,
  plot = TRUE, FAP = 0.05, seed = 11642257, L = 10000, limits = NA)

mchangepoint(x, subset, score = c("Identity", "Signed Ranks", "Spatial Signs",
  "Spatial Ranks", "Marginal Ranks"), only.mean = FALSE,
  plot = TRUE, FAP = 0.05, seed = 11642257, L = 10000, limits = NA) 

changepoint.normal.limits(n, m, score = c("Identity", "Ranks"),
  only.mean = FALSE, FAP = 0.05, seed = 11642257, L = 100000)

mchangepoint.normal.limits(p, n, m, score = c("Identity", "Signed Ranks", "Spatial Signs",
  "Spatial Ranks", "Marginal Ranks"), only.mean = FALSE,
  FAP = 0.05, seed = 11642257, L = 100000)

Arguments

`x`	`changepoint`: an n x m numeric matrix or a numeric vector of length m. `mchangepoint`: a p x n x m numeric array or a p x m numeric matrix. See below for the meaning of p, n and m.
`p`	integer: number of monitored variables.
`n`	integer: size of each subgroup (number of observations gathered at each time point).
`m`	integer: number of subgroups (time points).
`subset`	an optional vector specifying a subset of subgroups/time points to be used.
`score`	character: the transformation to use; see `mshewhart`.
`only.mean`	logical; if `TRUE`, only a location change-point is searched for.
`plot`	logical; if `TRUE`, the control statistic is displayed.
`FAP`	numeric (between 0 and 1): the desired false alarm probability.
`seed`	positive integer; if not `NA`, the RNG's state is reset using `seed`. The current `.Random.seed` will be preserved. Unused by `changepoint` and `mchangepoint` when `limits` is not `NA`.
`L`	positive integer: the number of Monte Carlo replications used to compute the control limits. Unused by `changepoint` and `mchangepoint` when `limits` is not `NA`.
`limits`	numeric: a precomputed vector of length m containing the control limits.
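The shapes accepted for `x` can be illustrated with simulated data. This is a minimal base R sketch; the object names and the dimensions chosen here are assumptions made only for illustration.

```r
set.seed(1)
p <- 2   # number of monitored variables
n <- 5   # subgroup size
m <- 20  # number of subgroups (time points)

# Univariate data (changepoint)
x_uni_ind <- rnorm(m)                        # individual observations: vector of length m
x_uni_sub <- matrix(rnorm(n * m), nrow = n)  # subgrouped observations: n x m matrix

# Multivariate data (mchangepoint)
x_multi_ind <- matrix(rnorm(p * m), nrow = p)            # individual observations: p x m matrix
x_multi_sub <- array(rnorm(p * n * m), dim = c(p, n, m)) # subgrouped observations: p x n x m array

dim(x_uni_sub)
dim(x_multi_sub)
```

In every layout the last dimension indexes time, so `x_multi_sub[, , t]` holds the p x n subgroup observed at time point t.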

Details

After an optional rank transformation (argument score), changepoint and mchangepoint compute, for τ = 2, …, m, the normal likelihood ratio test statistics for testing whether the mean and dispersion (or only the mean when only.mean=TRUE) are the same before and after τ. See Sullivan and Woodall (1996, 2000) and Qiu (2013), Chapter 6 and Section 7.5.
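To make the idea concrete, here is a minimal base R sketch of a textbook normal likelihood ratio statistic for a mean shift in univariate individual observations. It is not the package's internal code: the function name `lrt_mean_shift`, the scaling `m * log(s0 / s1)`, and the convention that τ is the first observation after the change are all assumptions made for illustration.

```r
# Generic normal-theory likelihood ratio statistic for a mean shift at tau
# (illustrative assumption; dfphase1 may scale or index the statistic differently).
lrt_mean_shift <- function(x) {
  m <- length(x)
  ss <- function(v) sum((v - mean(v))^2)   # sum of squares about the segment mean
  s0 <- ss(x)                              # residual sum of squares, no change
  stat <- rep(NA_real_, m)                 # tau = 1 is not a valid split
  for (tau in 2:m) {
    # residual sum of squares with separate means before and from tau onwards
    s1 <- ss(x[1:(tau - 1)]) + ss(x[tau:m])
    stat[tau] <- m * log(s0 / s1)
  }
  stat
}

# Deterministic toy series: the mean jumps from 0 to 5 at observation 31
x <- c(rep(c(-1, 1), 15), 5 + rep(c(-1, 1), 10))
stat <- lrt_mean_shift(x)
tau_hat <- which.max(stat)  # split with the largest statistic
tau_hat
```

The statistic is largest at the split that best separates the two segments, which is why its maximiser serves as an estimate of the change-point.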

Note that the control statistic is equivalent to that proposed by Lung-Yut-Fong et al. (2011) when score="Marginal Ranks" and only.mean=TRUE.

As suggested by Sullivan and Woodall (1996, 2000), control limits proportional to the in-control mean of the likelihood ratio test statistics are used. Further, when plot=TRUE, the control statistics divided by the time-varying control limits are plotted with a “pseudo-limit” equal to one.
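One way to obtain limits of this form is sketched below in base R, using random permutations of the data to approximate the in-control behaviour. This is a rough illustration under stated assumptions, not the package's algorithm: the helper names `lrt_mean_shift` and `perm_limits`, the mean-only statistic, and the small number of replications are all hypothetical choices.

```r
# Mean-shift likelihood ratio statistics for tau = 2, ..., m (illustrative only)
lrt_mean_shift <- function(x) {
  m <- length(x)
  ss <- function(v) sum((v - mean(v))^2)
  s0 <- ss(x)
  vapply(2:m, function(tau) {
    m * log(s0 / (ss(x[1:(tau - 1)]) + ss(x[tau:m])))
  }, numeric(1))
}

# Limits proportional to the in-control mean of the statistics:
# limits[tau] = k * ic_mean[tau], with k chosen so that the probability of
# any statistic exceeding its limit is approximately FAP under permutation.
perm_limits <- function(x, FAP = 0.05, L = 500) {
  sims <- replicate(L, lrt_mean_shift(sample(x)))  # (m - 1) x L matrix
  ic_mean <- rowMeans(sims)                        # in-control mean for each tau
  max_ratio <- apply(sims / ic_mean, 2, max)       # largest scaled statistic per replication
  k <- quantile(max_ratio, probs = 1 - FAP, names = FALSE)
  k * ic_mean
}

set.seed(11642257)
x <- c(rnorm(30), rnorm(20, mean = 5))  # simulated series with a large mean shift
limits <- perm_limits(x)
ratio <- lrt_mean_shift(x) / limits     # scaled statistics, compared with the pseudo-limit 1
signal <- any(ratio > 1)
signal
```

Dividing each statistic by its own limit puts all time points on a common scale, so a single horizontal line at one is enough to read the chart.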

When only.mean=FALSE, the decomposition of the likelihood ratio test statistic suggested by Sullivan and Woodall (1996, 2000) for diagnostic purposes is also computed, and optionally plotted.

Value

changepoint and mchangepoint return an invisible list with elements

`glr`	control statistics.
`mean`, `dispersion`	decomposition of the control statistics in the two parts due to changes in the mean and dispersion, respectively. These elements are present only when `only.mean=FALSE`.
`limits`	control limits.
`score`, `only.mean`, `FAP`, `L`, `seed`	input arguments.

changepoint.normal.limits and mchangepoint.normal.limits return a numeric vector containing the control limits.

Note

When limits is NA, changepoint and mchangepoint compute the control limits by permutation. The resulting control charts are distribution-free.
Pre-computed limits, such as those computed using changepoint.normal.limits and mchangepoint.normal.limits, are recommended only for univariate data when score="Ranks". In all other cases, the resulting control chart will not be distribution-free.
However, note that when score is "Signed Ranks", "Spatial Signs" or "Spatial Ranks", the normal-based control limits are distribution-free in the class of all multivariate elliptical distributions.
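The recommended univariate use of pre-computed limits might look like the sketch below. It relies only on the signatures listed under Usage; the simulated data, the reduced value of `L` (chosen to keep the run short) and the assumption that individual observations correspond to `n = 1` are illustrative. The code is guarded so that it does nothing when dfphase1 is not installed.

```r
if (requireNamespace("dfphase1", quietly = TRUE)) {
  library(dfphase1)

  set.seed(123)
  m <- 40
  x <- c(rnorm(25), rnorm(15, mean = 2))  # simulated individual observations

  # Normal-based limits for rank scores, computed once for this n and m
  # (assumption: n = 1 corresponds to individual observations)
  lim <- changepoint.normal.limits(n = 1, m = m, score = "Ranks", L = 2000)

  # Reuse the pre-computed limits instead of permuting the data
  res <- changepoint(x, score = "Ranks", plot = FALSE, limits = lim)
  names(res)
}
```

Because the limits depend only on n, m, the score and FAP, the same vector can be reused for any other data set with the same layout.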

Author(s)

Giovanna Capizzi and Guido Masarotto.

References

A. Lung-Yut-Fong, C. Lévy-Leduc, O. Cappé (2011) “Homogeneity and change-point detection tests for multivariate data using rank statistics”. arXiv:1107.1971, https://arxiv.org/abs/1107.1971.

P. Qiu (2013) Introduction to Statistical Process Control. Chapman & Hall/CRC Press.

J. H. Sullivan, W. H. Woodall (1996) “A control chart for preliminary analysis of individual observations”. Journal of Quality Technology, 28, pp. 265–278, doi:10.1080/00224065.1996.11979677.

J. H. Sullivan, W. H. Woodall (2000) “Change-point detection of mean vector or covariance matrix shifts using multivariate individual observations”. IIE Transactions, 32, pp. 537–549, doi:10.1080/07408170008963929.

Examples

data(gravel)
changepoint(gravel[1,,])
mchangepoint(gravel)
mchangepoint(gravel,score="Signed Ranks")

[Package dfphase1 version 1.2.0 Index]