R: Information matrix test under the K-gaps model

kgaps_imt {exdex}

R Documentation

Information matrix test under the K-gaps model

Description

Performs the information matrix test (IMT) of Suveges and Davison (2010) to diagnose misspecification of the K-gaps model.

Usage

kgaps_imt(data, u, k = 1, inc_cens = TRUE)

Arguments

data

A numeric vector or numeric matrix of raw data. If data is a matrix then the log-likelihood is constructed as the sum of (independent) contributions from different columns. A common situation is where each column relates to a different year.

If data contains missing values then split_by_NAs is used to divide the data into sequences of non-missing values.

u, k

Numeric vectors. u is a vector of extreme value thresholds applied to data. k is a vector of values of the run parameter K, as defined in Suveges and Davison (2010). See kgaps for more details.

Any values in u that are greater than all the observations in data will be removed without a warning being given.

inc_cens

A logical scalar indicating whether or not to include contributions from censored inter-exceedance times, relating to the first and last observations. See Attalides (2015) for details.

Details

The K-gaps IMT is performed over a grid of all combinations of threshold and K in the vectors u and k. If the estimate of θ is 0 then the IMT statistic and its associated p-value are both NA.
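To make the grid layout concrete, here is a minimal base R sketch. The thresholds and values of K are made up for illustration and are not taken from any data set; the sketch only enumerates the combinations and does not compute the test itself.

```r
# Hypothetical thresholds and run parameters, chosen only for illustration
u <- c(0.5, 1.0, 1.5)
k <- 1:2

# One row per (threshold, K) combination; u varies fastest
grid <- expand.grid(u = u, k = k)
nrow(grid)  # length(u) * length(k) combinations

# Position of each combination in a length(u) by length(k) matrix:
# rows index thresholds and columns index K
pos <- matrix(seq_len(nrow(grid)), nrow = length(u), ncol = length(k),
              dimnames = list(u = as.character(u), k = as.character(k)))
pos
```

Because expand.grid varies its first argument fastest, filling the matrix column by column puts all thresholds for a given K in the same column.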

For details of the IMT see Suveges and Davison (2010). There are some typographical errors on pages 18-19 that have been corrected in producing the code: the penultimate term inside {...} in the middle equation on page 18 should be (c_j(K))^2, as should the penultimate term in the first equation on page 19; the {...} bracket should be squared in the 4th equation on page 19; and the factor n should be N-1 in the final equation on page 19.

Value

An object (a list) of class c("kgaps_imt", "exdex") containing

imt

A length(u) by length(k) numeric matrix. Column i contains, for K = k[i], the values of the information matrix test statistic for the set of thresholds in u. The column names are the values in k. The row names are the approximate empirical percentage quantile levels of the thresholds in u.

p

A length(u) by length(k) numeric matrix containing the corresponding p-values for the test.

theta

A length(u) by length(k) numeric matrix containing the corresponding estimates of θ.

u, k

The input u and k.
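As a rough illustration of how the returned components might be inspected, the sketch below uses the newlyn data supplied with exdex. The object name res and the 0.05 cut-off are illustrative choices, not part of the package.

```r
library(exdex)  # provides kgaps_imt() and the newlyn data

u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
res <- kgaps_imt(newlyn, u = u, k = 1:5)

dim(res$imt)       # length(u) rows, length(k) columns
colnames(res$imt)  # the values in k
rownames(res$imt)  # approximate percentage quantile levels of u

# (threshold, K) pairs whose p-value exceeds 0.05.
# The 0.05 level is an assumption made for this example;
# entries with an NA p-value are dropped by which().
keep <- which(res$p > 0.05, arr.ind = TRUE)
keep
```

Each row of keep gives the row (threshold) and column (K) index of one combination, so res$theta[keep] returns the corresponding estimates of θ.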

References

Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292

Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf

Examples

### Newlyn sea surges

u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1))
imt <- kgaps_imt(newlyn, u = u, k = 1:5)

### S&P 500 index

u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1))
imt <- kgaps_imt(sp500, u = u, k = 1:5)

### Cheeseboro wind gusts (a matrix containing some NAs)

probs <- c(seq(0.5, 0.98, by = 0.025), 0.99)
u <- quantile(cheeseboro, probs = probs, na.rm = TRUE)
imt <- kgaps_imt(cheeseboro, u = u, k = 1:5)

[Package exdex version 1.2.3 Index]
