R: Getting the results of the SKFCPD model

SKFCPD {SKFCPD}

R Documentation

Getting the results of the SKFCPD model

Description

Estimating changepoint locations using the Dynamic Linear Model (DLM) within the Bayesian Online Changepoint Detection (BOCPD) framework. The efficient computation is achieved through implementation of the Kalman filter. The range parameter and noise-to-signal ratio are estimated from training samples via a Gaussian process model. This function is capable of handling multidimensional data with temporal correlations and random missing patterns.

Usage

  SKFCPD(design = NULL, response = NULL, FCPD = NULL, 
  init_params = list(gamma = 1, sigma_2 = 1, eta = 1), 
  train_prop = NULL, kernel_type = "matern_5_2", 
  hazard_vec=100, print_info = TRUE, truncate_at_prev_cp = FALSE)

Arguments

`design`	A vector with the length of n. The design of the experiment.
`response`	A matrix with dimension n x q. The observations.
`FCPD`	An object of the class `SKFCPD` computed in the previous run of the algorithm.
`init_params`	A list with estimated range parameter `gamma`, noise-to-signal parameter `eta` and variance parameter `sigma_2`. The default values are `gamma`=1, `eta`=1, and `sigma_2`=1.
`train_prop`	A numerical value between 0 and 1. The propotation of training samples for parameter estimation. When `train_prop`=NULL, we skip the training process and specify the parameter values in the argument `init_params`.
`kernel_type`	A character specifying the type of kernels of the input. `matern_5_2` are Matern correlation with roughness parameter 5/2. `exp` is power exponential correlation with roughness parameter alpha=2. The default choice is `matern_5_2`.
`hazard_vec`	Either a constant or a vector with the length of n. The hazard vector in the SKFCPD method. hazard_vec = 1/`hazard_const` is the prior probability that a changepoint occur at any time points. The default value of hazard_vec is 100.
`print_info`	This setting prints out updates on the progress of the algorithm if set to TRUE.
`truncate_at_prev_cp`	If TRUE, truncate the run length at the most recently detected changepoint. The default value of truncate_at_prev_cp is FALSE.

Value

SKFCPD returns a S4 object of class SKFCPD (see SKFCPD-class).

Author(s)

Hanmo Li [aut, cre], Yuedong Wang [aut], Mengyang Gu [aut]

Maintainer: Hanmo Li <hanmo@pstat.ucsb.edu>

References

Li, Hanmo, Yuedong Wang, and Mengyang Gu. Sequential Kalman filter for fast online changepoint detection in longitudinal health records. arXiv preprint arXiv:2310.18611 (2023).

Fearnhead, Paul, and Zhen Liu. On-line inference for multiple changepoint problems. Journal of the Royal Statistical Society Series B: Statistical Methodology 69, no. 4 (2007): 589-605.

Adams, Ryan Prescott, and David JC MacKay. Bayesian online changepoint detection. arXiv preprint arXiv:0710.3742 (2007).

Hartikainen, Jouni, and Simo Sarkka. Kalman filtering and smoothing solutions to temporal Gaussian process regression models. In 2010 IEEE international workshop on machine learning for signal processing, pp. 379-384. IEEE, 2010.

Examples

  library(SKFCPD)
  
  #------------------------------------------------------------------------------
  # Example: fast online changepoint detection with DEPENDENT data.
  # 
  # Data generation: Data follows a multidimensional Gaussian process with Matern 2.5 kernel.
  #------------------------------------------------------------------------------
  # Data Generation
  set.seed(1)
  
  n_obs = 150
  n_dim = 2
  seg_len = c(70, 30, 20,30)
  mean_each_seg = c(0,1,-1,0)
  
  x_mat=matrix(1:n_obs)
  y_mat=matrix(NA, nrow=n_obs, ncol=n_dim)
  
  gamma = rep(5, n_dim) # range parameter of the covariance matrix
  
  # compute the matern 2.5 kernel
  construct_cor_matrix = function(input, gamma){
    n = length(input)
    R0=abs(outer(input,(input),'-'))
    matrix_one = matrix(1, n, n)
    const = sqrt(5) * R0 / gamma
    Sigma = (matrix_one + const + const^2/3) * (exp(-const))
    return(Sigma)
  }
  
  for(j in 1:n_dim){
    y_each_dim = c()
    for(i in 1:length(seg_len)){
      nobs_per_seg = seg_len[i]
      Sigma = construct_cor_matrix(1:nobs_per_seg, gamma[j])
      L=t(chol(Sigma))
      theta=rep(mean_each_seg[i],nobs_per_seg)+L%*%rnorm(nobs_per_seg)
      y_each_dim = c(y_each_dim, theta+0.1*rnorm(nobs_per_seg))
    }
    y_mat[,j] = y_each_dim
  }
  
  ## Detect changepoints by SKFCPD
  Online_CPD_1 = SKFCPD(design = x_mat,
                        response = y_mat,
                        train_prop = 1/3)
  
  ## visulize the results
  plot_SKFCPD(Online_CPD_1)

[Package SKFCPD version 0.2.4 Index]