COPPS {crossvalidationCP}    R Documentation
Cross-validation with Order-Preserved Sample-Splitting
Description
Tuning parameters are selected by a generalised COPPS procedure. All functions use Order-Preserved Sample-Splitting, meaning that the two folds are the odd- and the even-indexed observations. The three functions differ in the cross-validation criterion they use: COPPS is the original COPPS procedure (Zou et al., 2020), i.e. it uses quadratic error loss, whereas CV1 and CVmod use absolute error loss and the modified quadratic error loss, respectively.
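For illustration only (not part of the package interface), the two folds used by Order-Preserved Sample-Splitting can be sketched as follows:

# sketch of the two Order-Preserved Sample-Splitting folds (illustration only)
Y <- rnorm(100)
oddFold  <- Y[seq(1, length(Y), by = 2)]   # odd-indexed observations
evenFold <- Y[seq(2, length(Y), by = 2)]   # even-indexed observations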
Usage
COPPS(Y, param = 5L, estimator = leastSquares,
output = c("param", "fit", "detailed"), ...)
CV1(Y, param = 5L, estimator = leastSquares,
output = c("param", "fit", "detailed"), ...)
CVmod(Y, param = 5L, estimator = leastSquares,
output = c("param", "fit", "detailed"), ...)
Arguments
Y
the observations; can be any data type that supports the function length.

param
either a single integer giving the maximal number of allowed change-points or a list of tuning parameters; each entry must be acceptable for the given estimator, see the examples below.

estimator
a function providing a local estimate. For pre-implemented estimators see estimators. The function must have the arguments Y, param and ..., where ... are additional parameters passed to the estimator.

output
a string specifying the output, either "param", "fit" or "detailed". For details see Value.

...
additional parameters that are passed to estimator.
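As a rough illustration (based on the examples below, not an exact description of the internal conversion), an integer param such as the default 5L can be thought of as shorthand for a list of candidate numbers of change-points:

# assuming the default leastSquares estimator, param = 5L roughly corresponds to
# allowing 0 to 5 change-points, i.e. a candidate list such as
paramList <- as.list(0:5)
CV1(Y = rnorm(100), param = paramList)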
Value
If output == "param", the selected tuning parameter, i.e. an entry of param.

If output == "fit", a list with the entries param, giving the selected tuning parameter, and fit. The named entry fit is a list giving the fit obtained by applying estimator to the whole data Y with the selected tuning parameter; the returned value is transformed to a list with an entry cps giving the estimated change-points and, if provided by estimator, an entry value giving the estimated local values.

If output == "detailed", the same as for output == "fit", but with the additional entries CP, CVodd, and CVeven giving the calculated cross-validation criteria for all entries of param. CVodd and CVeven are the criteria when the odd-indexed and even-indexed observations, respectively, form the test set; CP is the sum of those two.
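A small sketch (assuming the structure described above) of how the entries of a "detailed" result can be accessed:

# inspecting a "detailed" result; entry names as described above
res <- CV1(Y = rnorm(100), output = "detailed")
res$param     # the selected tuning parameter
res$fit$cps   # estimated change-points from refitting on the whole data
res$CP        # cross-validation criterion for each entry of param
res$CVodd     # criterion with the odd-indexed observations as test set
res$CVeven    # criterion with the even-indexed observations as test set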
References
Pein, F., and Shah, R. D. (2021) Cross-validation for change-point regression: pitfalls and solutions. arXiv:2112.03220.
Zou, C., Wang, G., and Li, R. (2020) Consistent selection of the number of change-points via sample-splitting. The Annals of Statistics, 48(1), 413–439.
See Also
estimators, criteria, convertSingleParam
Examples
# call with default parameters:
# 2-fold cross-validation with ordered folds, absolute error loss,
# least squares estimation, and possible parameters being 0 to 5 change-points
CV1(Y = rnorm(100))
# the same, but with modified error loss
CVmod(Y = rnorm(100))
# the same, but with quadratic error loss, identical to the COPPS procedure
COPPS(Y = rnorm(100))
# more interesting data and more detailed output
set.seed(1L)
Y <- c(rnorm(50), rnorm(50, 5), rnorm(50), rnorm(50, 5))
CV1(Y = Y, output = "detailed")
# finds the correct change-points at 50, 100, 150
# (plus the start and end points 0 and 200)
# list of parameters, only allowing 1 or 2 change-points
CVmod(Y = Y, param = as.list(1:2))
# COPPS potentially fails to provide a good selection when large changes occur at odd locations
# Example 1 in Pein and Shah (2021); see Section 2.2 of that paper for more details
set.seed(1)
exampleY <- rnorm(102, c(rep(10, 46), rep(0, 5), rep(30, 51)))
# misses one change-point
COPPS(Y = exampleY)
# correct number of change-points when modified criterion (or absolute error loss) is used
CVmod(Y = exampleY)
# PELT as a local estimator instead of least squares estimation
# param must contain parameters that are acceptable for the given estimator
CV1(Y = Y, estimator = pelt, output = "detailed", param = list("SIC", "MBIC", 3 * log(length(Y))))
# argument minseglen of pelt specified in ...
CVmod(Y = Y, estimator = pelt, output = "detailed", param = list("SIC", "MBIC", 3 * log(length(Y))),
minseglen = 30)