hhj {blocklength}

R Documentation

Hall, Horowitz, and Jing (1995) "HHJ" Algorithm to Select the Optimal Block-Length

Description

Perform the Hall, Horowitz, and Jing (1995) "HHJ" cross-validation algorithm to select the optimal block-length for a bootstrap on dependent data (block bootstrap). The algorithm is suited to dependent data such as stationary time series.

Usage

hhj(
  series,
  nb = 100L,
  n_iter = 10L,
  pilot_block_length = NULL,
  sub_sample = NULL,
  k = "two-sided",
  bofb = 1L,
  search_grid = NULL,
  grid_step = c(1L, 1L),
  cl = NULL,
  verbose = TRUE,
  plots = TRUE
)

Arguments

`series`	a numeric vector or time series giving the original data for which to find the optimal block-length.
`nb`	an integer value, number of bootstrapped series to compute.
`n_iter`	an integer value, maximum number of iterations for the HHJ algorithm to compute.
`pilot_block_length`	a numeric value, the block-length (`l*` in HHJ) for which to perform initial block bootstraps.
`sub_sample`	a numeric value, the length of each overlapping subsample, `m` in HHJ.
`k`	a character string, either `"bias/variance"`, `"one-sided"`, or `"two-sided"` depending on the desired object of estimation. If the desired bootstrap statistic is bias or variance then select `"bias/variance"` which sets `k = 3` per HHJ. If the object of estimation is the one-sided or two-sided distribution function, then set `k = "one-sided"` or `k = "two-sided"` which sets `k = 4` and `k = 5`, respectively. For the purpose of generating symmetric confidence intervals around an unknown parameter, `k = "two-sided"` (the default) should be used.
`bofb`	a numeric value, length of the basic blocks in the block-of-blocks bootstrap, see the `m` argument of `tsbootstrap` and Kunsch (1989).
`search_grid`	a numeric value, the range of solutions around `l*` to evaluate within the `MSE` function after the first iteration. The first iteration will search through all the possible block-lengths unless specified otherwise in `grid_step`.
`grid_step`	a numeric value or vector of at most length 2, the number of steps to increment over the subsample block-lengths when evaluating the `MSE` function. If `grid_step = 1` then each block-length will be evaluated in the `MSE` function. If `grid_step > 1`, the `MSE` function will search over the sequence of block-lengths from `1` to `m` by `grid_step`. If `grid_step` is a vector of length 2, the first iteration will step by the first element of `grid_step` and subsequent iterations will step by the second element.
`cl`	a cluster object, created by package parallel, doParallel, or snow. If `NULL`, no parallelization will be used.
`verbose`	a logical value, if set to `FALSE` then no interim messages are output to the console. Error messages will still be output. Default is `TRUE`.
`plots`	a logical value, if set to `FALSE` then no interim plots are drawn. Default is `TRUE`.
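
As a rough illustration of how `grid_step` thins the candidate block-lengths, the base-R sketch below builds the sequence from `1` to `m` described above. The helper name `candidate_lengths` is hypothetical and is not part of the blocklength package; the package's internal search may differ in detail.

```r
# Hypothetical helper (not part of blocklength): candidate block-lengths
# from 1 to m, stepping by grid_step, as described for `grid_step` above.
candidate_lengths <- function(m, grid_step = 1L) {
  seq(from = 1L, to = m, by = grid_step)
}

m <- 10L                                    # subsample length (`sub_sample`)
all_lengths    <- candidate_lengths(m, 1L)  # every block-length from 1 to 10
coarse_lengths <- candidate_lengths(m, 3L)  # 1, 4, 7, 10

# With a length-2 grid_step, the first iteration steps by the first element
# and subsequent iterations step by the second.
grid_step  <- c(3L, 1L)
first_iter <- candidate_lengths(m, grid_step[1])
later_iter <- candidate_lengths(m, grid_step[2])
```

A coarse first pass followed by a finer step is one way to reduce the number of `MSE` evaluations in the most expensive iteration.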

Details

The HHJ algorithm is computationally intensive as it relies on a cross-validation process using a type of subsampling to estimate the mean squared error (MSE) incurred by the bootstrap at various block-lengths.
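
To make the subsampling step concrete, the sketch below splits a series of length `n` into the `n - m + 1` overlapping subsamples of length `m`. It uses only base R; `overlapping_subsamples` is a hypothetical helper for illustration, not a function exported by blocklength, and the package's internals may be organized differently.

```r
# Hypothetical helper (not part of blocklength): all overlapping
# subsamples of length m from a series, returned as matrix columns.
overlapping_subsamples <- function(series, m) {
  n <- length(series)
  stopifnot(m >= 2, m <= n)
  starts <- seq_len(n - m + 1L)
  # Column i holds series[i], ..., series[i + m - 1]
  vapply(starts, function(i) as.numeric(series[i:(i + m - 1L)]), numeric(m))
}

x <- c(2, 4, 6, 8, 10)
subs <- overlapping_subsamples(x, m = 3)
dim(subs)   # m rows by (n - m + 1) columns
subs[, 2]   # second subsample: 4 6 8
```

Because each subsample is itself bootstrapped at several block-lengths, the cost grows with both the number of subsamples and the size of the block-length grid, which is why the procedure is computationally intensive.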

Under the hood, `hhj()` makes use of `tsbootstrap`, see Trapletti and Hornik (2020), to perform the moving block bootstrap (or the block-of-blocks bootstrap by setting `bofb > 1`) according to Kunsch (1989).
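
For readers unfamiliar with the moving block bootstrap, the following base-R sketch shows the general idea: draw overlapping blocks of a fixed length with replacement and concatenate them into a new series. It is an illustrative assumption about the technique, not the `tsbootstrap` implementation that `hhj()` calls; the function name `mbb_resample` and its arguments are hypothetical.

```r
# Hypothetical sketch of one moving block bootstrap resample (base R only).
mbb_resample <- function(series, block_length) {
  n <- length(series)
  stopifnot(block_length >= 1, block_length <= n)
  n_blocks <- ceiling(n / block_length)
  # Randomly chosen start indices among all overlapping blocks
  starts <- sample.int(n - block_length + 1L, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_length - 1L)))
  # Trim the concatenated blocks back to the original length
  series[idx[seq_len(n)]]
}

set.seed(1)
x <- as.numeric(stats::arima.sim(list(ar = 0.5), n = 200))

# Bootstrap the sample mean with blocks of length 5
boot_means <- replicate(100, mean(mbb_resample(x, block_length = 5)))
var(boot_means)  # bootstrap estimate of the variance of the sample mean
```

The block-length trades off bias against variance in estimates like this one, which is the choice the HHJ algorithm is designed to automate.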

Value

an object of class 'hhj'

References

Trapletti, A. and Hornik, K. (2020) tseries: Time Series Analysis and Computational Finance. R package version 0.10-48.

Kunsch, H. (1989) The Jackknife and the Bootstrap for General Stationary Observations. The Annals of Statistics, 17(3), 1217-1241. Retrieved February 16, 2021, from http://www.jstor.org/stable/2241719

Hall, P., Horowitz, J. L., and Jing, B.-Y. (1995) On blocking rules for the bootstrap with dependent data. Biometrika, 82(3), 561-574. doi: 10.1093/biomet/82.3.561

Examples


# Generate AR(1) time series
sim <- stats::arima.sim(list(order = c(1, 0, 0), ar = 0.5),
                        n = 500, innov = rnorm(500))

# Calculate optimal block length for series
hhj(sim, sub_sample = 10)


# Use parallel computing
library(parallel)

# Make cluster object with 2 cores
cl <- makeCluster(2)

# Calculate optimal block length for series
hhj(sim, cl = cl)

# Release the worker processes when finished
stopCluster(cl)

[Package blocklength version 0.1.5 Index]