cpss.custom {cpss}    R Documentation

Detecting changes in user-customized models

Description

Detects change-points in models customized by the user through their own data-subsetting, parameter-estimation and cost functions.

Usage

cpss.custom(
  dataset,
  n,
  g_subdat,
  g_param,
  g_cost,
  algorithm = "BS",
  dist_min = floor(log(n)),
  ncps_max = ceiling(n^0.4),
  pelt_pen_val = NULL,
  pelt_K = 0,
  wbs_nintervals = 500,
  criterion = "CV",
  times = 2,
  model = NULL,
  g_smry = NULL,
  easy_cost = NULL,
  param.opt = NULL
)

Arguments

dataset

an ANY object that could be a vector, matrix, tensor, list, etc.

n

an integer indicating the sample size of dataset.

g_subdat

a customized R function of two arguments dat and indices, which extracts a subset of data dat according to a collection of time indices indices. The returned object inherits the class from that of dataset. The argument dat inherits the class from that of dataset, and the argument indices is a logical vector with TRUEs indicating extracted indices.
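
As an illustration of the indices convention, a minimal sketch for a matrix whose rows are time points might look as follows. The name g_subdat_mat and the toy data are hypothetical and not part of the package.

```r
# Hypothetical g_subdat for a matrix dataset: each row is one time point.
g_subdat_mat <- function(dat, indices) {
  # indices is a logical vector; drop = FALSE keeps the matrix class
  # even when only a single row is selected.
  dat[indices, , drop = FALSE]
}

x <- matrix(1:12, nrow = 6, ncol = 2)            # 6 time points, 2 variables
idx <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)  # keep time points 1, 2 and 5
sub <- g_subdat_mat(x, idx)
dim(sub)  # 3 rows, 2 columns
```

Using drop = FALSE is one way to make sure the returned object keeps the class of dataset, as the argument description requires.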

g_param

a customized R function of two arguments dat (cf. dat of g_subdat) and param.opt (cf. param.opt of cpss.custom), which returns estimated parameters based on the data segment dat. It could return a numeric value, vector, matrix, list, etc.

g_cost

a customized R function of two arguments dat (cf. dat of g_subdat) and param, which returns a numeric value of the associated cost for data segment dat with parameters param. The argument param inherits the class from that of the returned object of g_param.
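
To show how the output of g_param feeds into g_cost, here is a minimal sketch for a change-in-mean model with a squared-error cost. The function names and the choice of cost are assumptions made for illustration only.

```r
# Hypothetical pair for a change-in-mean model with squared-error cost.
g_param_mean <- function(dat, param.opt = NULL) {
  # The estimated parameter is the segment mean, wrapped in a list.
  list(mu = mean(dat))
}
g_cost_mean <- function(dat, param) {
  # param has the same structure as the object returned by g_param_mean.
  sum((dat - param$mu)^2)
}

seg <- c(1, 2, 3)
par_hat <- g_param_mean(seg)
cost <- g_cost_mean(seg, par_hat)  # (1 - 2)^2 + (2 - 2)^2 + (3 - 2)^2 = 2
```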

algorithm

a character string specifying the change-point searching algorithm, one of the following choices: "SN" (segment neighborhood), "BS" (binary segmentation), "WBS" (wild binary segmentation) and "PELT" (pruned exact linear time) algorithms.

dist_min

an integer specifying minimum searching distance (length of feasible segments).

ncps_max

an integer specifying an upper bound of the number of true change-points.
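
The defaults of dist_min and ncps_max in the Usage section depend only on n. As a quick check, they can be evaluated by hand for a hypothetical sample size of 1000:

```r
# Default values of dist_min and ncps_max for a hypothetical sample size.
n <- 1000
dist_min_default <- floor(log(n))    # log(1000) is about 6.91, so 6
ncps_max_default <- ceiling(n^0.4)   # 1000^0.4 is about 15.85, so 16
```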

pelt_pen_val

a numeric vector specifying candidate values of the penalty; used only if algorithm = "PELT".

pelt_K

a numeric value for pruning adjustment; used only if algorithm = "PELT". It is usually taken to be 0 if the negative log-likelihood is used as the cost; see Killick et al. (2012).

wbs_nintervals

an integer specifying the number of random intervals drawn; used only if algorithm = "WBS". See Fryzlewicz (2014).

criterion

a character string specifying the model selection criterion, "CV" ("cross-validation") or "MS" ("multiple-splitting").

times

an integer specifying how many times sample-splitting should be performed; it should be 2 if criterion = "CV".

model

a character string indicating the considered change model.

g_smry

a customized R function of two arguments dataset (cf. dataset of cpss.custom) and param.opt (cf. param.opt of cpss.custom), which calculates the summary statistics that will be used for cost evaluation. The returned object is a list.

easy_cost

a customized R function of three arguments data_smry, s and e, which evaluates the value of the cost for a data segment from observed time point s to e. The argument data_smry inherits the class from that of the returned object of g_smry.
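
One way g_smry and easy_cost could work together is through cumulative sums, so that the cost of any segment is obtained in constant time. The sketch below does this for a squared-error cost. The function names are hypothetical, and it assumes that s and e are 1-based, inclusive end points, which the description above does not state explicitly.

```r
# Hypothetical g_smry: cumulative sums padded with a leading zero.
g_smry_mean <- function(dataset, param.opt = NULL) {
  list(cs = c(0, cumsum(dataset)), cs2 = c(0, cumsum(dataset^2)))
}
# Hypothetical easy_cost: squared-error cost of the segment s..e
# (assumed 1-based and inclusive), computed without revisiting the data.
easy_cost_mean <- function(data_smry, s, e) {
  len <- e - s + 1
  s1 <- data_smry$cs[e + 1] - data_smry$cs[s]    # sum of x[s:e]
  s2 <- data_smry$cs2[e + 1] - data_smry$cs2[s]  # sum of x[s:e]^2
  s2 - s1^2 / len
}

x <- c(1, 2, 3, 10, 11, 12)
smry <- g_smry_mean(x)
fast <- easy_cost_mean(smry, 4, 6)
slow <- sum((x[4:6] - mean(x[4:6]))^2)  # direct evaluation for comparison
all.equal(fast, slow)
```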

param.opt

an ANY object specifying additional constant parameters needed for parameter estimation or cost evaluation beyond unknown parameters.

Value

cpss.custom returns an object of an S4 class, called "cpss", which collects data and information required for further change-point analyses and summaries.

dat

data set

mdl

considered change-point model

algo

change-point searching algorithm

algo_param_dim

user-specified upper bound of the number of true change-points if algorithm = "SN"/"BS"/"WBS", or user-specified candidate values of the penalty only if algorithm = "PELT"

SC

model selection criterion

ncps

estimated number of change-points

pelt_pen

selected value of the penalty only if algorithm = "PELT"

cps

a vector of estimated locations of change-points

params

a list in which each member is a list containing the estimated parameters for the associated data segment

S_vals

a numeric vector of candidate model dimensions in terms of a sequence of numbers of change-points or values of the penalty

SC_vals

a numeric matrix; each column records the values of the criterion, evaluated on the validation data split, under the corresponding model dimension in S_vals, and each row corresponds to one sample split
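
Because the returned object is an S4 object, its components are read with the @ operator. The following sketch runs cpss.custom on a simulated series and inspects a few of the slots listed above. The simulated data and the squared-error functions are assumptions chosen for illustration.

```r
library("cpss")

# Hypothetical simulated series with one mean shift after time point 100.
set.seed(1)
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 5))

res <- cpss.custom(
  dataset = x, n = length(x),
  g_subdat = function(dat, indices) dat[indices],
  g_param = function(dat, param.opt = NULL) mean(dat),
  g_cost = function(dat, param) sum((dat - param)^2)
)

res@ncps     # estimated number of change-points
res@cps      # estimated change-point locations
res@params   # estimated parameters, one list member per segment
res@S_vals   # candidate model dimensions
res@SC_vals  # criterion values per split and model dimension
```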

References

Killick, R., Fearnhead, P., and Eckley, I. A. (2012). Optimal Detection of Changepoints With a Linear Computational Cost. Journal of the American Statistical Association, 107(500): 1590–1598.

Fryzlewicz, P. (2014). Wild binary segmentation for multiple change-point detection. The Annals of Statistics, 42(6): 2243–2281.

Examples


library("cpss")
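# Least absolute deviation (L1) fit: detect changes in the median.
# Extract the observations flagged by the logical vector indices.
g_subdat_l1 <- function(dat, indices) {
  dat[indices]
}
# Estimate the segment parameter by the sample median.
g_param_l1 <- function(dat, param.opt = NULL) {
  return(median(dat))
}
# Cost of a segment: sum of absolute deviations from the parameter.
g_cost_l1 <- function(dat, param) {
  return(sum(abs(dat - param)))
}
# Search for at most 11 change-points in the well data.
res <- cpss.custom(
  dataset = well, n = length(well),
  g_subdat = g_subdat_l1, g_param = g_param_l1, g_cost = g_cost_l1,
  ncps_max = 11
)
summary(res)
# Plot the data and mark the estimated change-point locations.
plot(well)
abline(v = res@cps, col = "red")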


[Package cpss version 0.0.3 Index]