R: Block-based General-to-Specific (GETS) modelling

blocksFun {gets}

R Documentation

Block-based General-to-Specific (GETS) modelling

Description

Auxiliary function (i.e. not intended for the average user) that enables block-based GETS-modelling with user-specified estimator, diagnostics and goodness-of-fit criterion.

Usage

blocksFun(y, x, untransformed.residuals=NULL, blocks=NULL,
  no.of.blocks=NULL, max.block.size=30, ratio.threshold=0.8,
  gets.of.union=TRUE, force.invertibility=FALSE,
  user.estimator=list(name="ols"), t.pval=0.001, wald.pval=t.pval,
  do.pet=FALSE, ar.LjungB=NULL, arch.LjungB=NULL, normality.JarqueB=NULL,
  user.diagnostics=NULL, gof.function=list(name="infocrit"),
  gof.method=c("min", "max"), keep=NULL, include.gum=FALSE,
  include.1cut=FALSE, include.empty=FALSE, max.paths=NULL,
  turbo=FALSE, parallel.options=NULL, tol=1e-07, LAPACK=FALSE,
  max.regs=NULL, print.searchinfo=TRUE, alarm=FALSE)

Arguments

`y`	a numeric vector (with no missing values, i.e. no non-numeric 'holes')
`x`	a `matrix`, or a `list` of matrices
`untransformed.residuals`	`NULL` (default) or, when `ols` is used with `method=6` in `user.estimator`, a numeric vector containing the untransformed residuals
`blocks`	`NULL` (default) or a `list` of lists with vectors of integers that indicate how blocks should be put together. If `NULL`, then the block composition is undertaken automatically by an internal algorithm that depends on `no.of.blocks`, `max.block.size` and `ratio.threshold`
`no.of.blocks`	`NULL` (default) or `integer`. If `NULL`, then the number of blocks is determined automatically by an internal algorithm
`max.block.size`	`integer` that controls the size of blocks
`ratio.threshold`	`numeric` between 0 and 1 that controls the minimum ratio of variables in each block to total observations
`gets.of.union`	`logical`. If `TRUE` (default), then GETS modelling is undertaken of the union of retained variables. Otherwise it is not
`force.invertibility`	`logical`. If `TRUE`, then the x-matrix is ensured to have full row-rank before it is passed on to `getsFun`
`user.estimator`	`list`, see `getsFun` for the details
`t.pval`	`numeric` value between 0 and 1. The significance level used for the two-sided coefficient significance t-tests
`wald.pval`	`numeric` value between 0 and 1. The significance level used for the Parsimonious Encompassing Tests (PETs)
`do.pet`	`logical`. If `TRUE`, then a Parsimonious Encompassing Test (PET) against the GUM is undertaken at each variable removal for the joint significance of all the deleted regressors along the current GETS path. If `FALSE`, then a PET is not undertaken at each removal
`ar.LjungB`	a two element `vector`, or `NULL`. In the former case, the first element contains the AR-order, the second element the significance level. If `NULL`, then a test for autocorrelation in the residuals is not conducted
`arch.LjungB`	a two element `vector`, or `NULL`. In the former case, the first element contains the ARCH-order, the second element the significance level. If `NULL`, then a test for ARCH in the residuals is not conducted
`normality.JarqueB`	`NULL` or a `numeric` value between 0 and 1. In the latter case, a test for non-normality in the residuals is conducted using a significance level equal to `normality.JarqueB`. If `NULL`, then no test for non-normality is conducted
`user.diagnostics`	`NULL` (default) or a `list` with two entries, `name` and `pval`. See `getsFun` for the details
`gof.function`	`list`. The first item should be named `name` and contain the name (a character) of the Goodness-of-Fit (GOF) function used. Additional items in the list `gof.function` are passed on as arguments to the GOF-function. . See `getsFun` for the details
`gof.method`	`character`. Determines whether the best Goodness-of-Fit is a minimum (default) or maximum
`keep`	`NULL` (default), `vector` of integers or a `list` of vectors of integers. In the latter case, the number of vectors should be equal to the number of matrices in `x`
`include.gum`	`logical`. If `TRUE`, then the GUM (i.e. the starting model) is included among the terminal models
`include.1cut`	`logical`. If `TRUE`, then the 1-cut model is added to the list of terminal models
`include.empty`	`logical`. If `TRUE`, then the empty model is added to the list of terminal models
`max.paths`	`NULL` (default) or `integer` greater than 0. If `NULL`, then there is no limit to the number of paths. If `integer` (e.g. 1), then this integer constitutes the maximum number of paths searched (e.g. a single path)
`turbo`	`logical`. If `TRUE`, then (parts of) paths are not searched twice (or more) unnecessarily in each GETS modelling. Setting `turbo` to `TRUE` entails a small additional computational costs, but may be outweighed substantially if estimation is slow, or if the number of variables to delete in each path is large
`parallel.options`	`NULL` or `integer` that indicates the number of cores/threads to use for parallel computing (implemented w/`makeCluster` and `parLapply`)
`tol`	`numeric` value, the tolerance for detecting linear dependencies in the columns of the variance-covariance matrix when computing the Wald-statistic used in the Parsimonious Encompassing Tests (PETs), see the `qr.solve` function
`LAPACK`	currently not used
`max.regs`	`integer`. The maximum number of regressions along a deletion path. Do not alter unless you know what you are doing!
`print.searchinfo`	`logical`. If `TRUE` (default), then a print is returned whenever simiplification along a new path is started
`alarm`	`logical`. If `TRUE`, then a sound or beep is emitted (in order to alert the user) when the model selection ends

Details

blocksFun undertakes block-based GETS modelling by a repeated but structured call to getsFun. For the details of how to user-specify an estimator via user.estimator, diagnostics via user.diagnostics and a goodness-of-fit function via gof.function, see documentation of getsFun under "Details".

The algorithm of blocksFun is similar to that of isat, but more flexible. The main use of blocksFun is the creation of user-specified methods that employs block-based GETS modelling, e.g. indicator saturation techniques.

Value

A list with the results of the block-based GETS-modelling.

Author(s)

Genaro Sucarrat, with contributions from Jonas kurle, Felix Pretis and James Reade

References

F. Pretis, J. Reade and G. Sucarrat (2018): 'Automated General-to-Specific (GETS) Regression Modeling and Indicator Saturation for Outliers and Structural Breaks'. Journal of Statistical Software 86, Number 3, pp. 1-44

G. sucarrat (2020): 'User-Specified General-to-Specific and Indicator Saturation Methods'. The R Journal 12 issue 2, pp. 388-401, https://journal.r-project.org/archive/2021/RJ-2021-024/

Examples


## more variables than observations:
y <- rnorm(20)
x <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, x)

## 'x' as list of matrices:
z <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, list(x,z))

## ensure regressor no. 3 in matrix no. 2 is not removed:
blocksFun(y, list(x,z), keep=list(integer(0), 3))

[Package gets version 0.38 Index]