R: Sufficient statistics for the left-censored inter-exceedances...

dgaps_stat {exdex}

R Documentation

Sufficient statistics for the left-censored inter-exceedances time model

Description

Calculates sufficient statistics for the left-censored inter-exceedances time D-gaps model for the extremal index \theta.

Usage

dgaps_stat(data, u, q_u, D = 1, inc_cens = TRUE)

Arguments

`data`	A numeric vector of raw data. No missing values are allowed.
`u`	A numeric scalar. Extreme value threshold applied to data.
`q_u`	A numeric scalar. An estimate of the probability with which the threshold `u` is exceeded. If `q_u` is missing then it is calculated using `mean(data > u, na.rm = TRUE)`.
`D`	A numeric scalar. The censoring parameter `D`, as defined in Holesovsky and Fusek (2020). Threshold inter-exceedance times that are not larger than `D` units are left-censored: each contributes `1 - \theta e^{-\theta d}` to the likelihood, where `d = q D` (see Details).
`inc_cens`	A logical scalar indicating whether or not to include contributions from the right-censored inter-exceedance times relating to the first and last observations. All that is known about these times is that they are greater than or equal to the time observed. See Attalides (2015) for details.

Details

The sample inter-exceedance times are T_0, T_1, ..., T_{N-1}, T_N, where T_1, ..., T_{N-1} are uncensored and T_0 and T_N are right-censored. Under the assumption that the inter-exceedance times are independent, the log-likelihood of the D-gaps model is given by

l(\theta; T_0, \ldots, T_N) = N_0 \log(1 - \theta e^{-\theta d}) + 2 N_1 \log \theta - \theta q (I_0 T_0 + \cdots + I_N T_N),

where

q is the threshold exceedance probability, estimated by the proportion of threshold exceedances,
d = q D,
I_j = 1 if T_j > D and I_j = 0 otherwise,
N_0 is the number of sample inter-exceedance times that are left-censored, that is, are less than or equal to D,
N_1 is (apart from an adjustment for the contributions of T_0 and T_N) the number of inter-exceedance times that are not left-censored, that is, are greater than D,
specifically, if inc_cens = TRUE then N_1 is equal to the number of T_1, ..., T_{N-1} that are greater than D plus (I_0 + I_N) / 2.
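
The definitions above can be sketched in base R. This is a minimal illustration, not the code used by exdex: the name `dgaps_stat_sketch` is hypothetical, and it assumes that T_0 is the number of observations before the first exceedance and T_N is the number of observations after the last exceedance.

```r
# Hypothetical helper, not part of exdex: follows the definitions in Details.
# Assumes that 'data' contains at least two threshold exceedances.
dgaps_stat_sketch <- function(data, u, D = 1, inc_cens = TRUE) {
  q <- mean(data > u)            # estimated exceedance probability
  exc <- which(data > u)         # positions of the threshold exceedances
  T_unc <- diff(exc)             # T_1, ..., T_{N-1}
  N0 <- sum(T_unc <= D)          # left-censored at D
  N1 <- sum(T_unc > D)           # not left-censored
  sum_qtd <- q * sum(T_unc[T_unc > D])
  if (inc_cens) {
    # Assumption: T_0 and T_N are the gaps before the first and after the
    # last exceedance
    T_cens <- c(exc[1] - 1, length(data) - exc[length(exc)])
    keep <- T_cens > D           # the indicators I_0 and I_N
    N1 <- N1 + sum(keep) / 2     # each kept censored time counts one half
    sum_qtd <- sum_qtd + q * sum(T_cens[keep])
  }
  list(N0 = N0, N1 = N1, sum_qtd = sum_qtd)
}

# Toy series with exceedances of u = 4 at positions 2, 3, 6 and 10
x <- c(0, 5, 6, 0, 0, 7, 0, 0, 0, 8, 0, 0)
dgaps_stat_sketch(x, u = 4, D = 1)
```

For this toy series the uncensored times are 1, 3 and 4, so one is left-censored at D = 1 and two are not. Of the two right-censored times (1 and 2) only the second exceeds D, which adds one half to N_1.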

The differing treatment of uncensored and right-censored inter-exceedance times reflects their differing contributions to the likelihood. Right-censored inter-exceedance times whose observed values are less than or equal to D add no information to the likelihood because we do not know to which part of the likelihood they should contribute.
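
As a sketch of where the factor (I_0 + I_N) / 2 comes from, suppose that under the model P(T > t) = \theta e^{-\theta q t} for t > 0. This form is an assumption made here for illustration; it is consistent with the left-censoring term in the log-likelihood above. The individual likelihood contributions are then

```latex
\begin{align*}
T_j \le D \ \text{(left-censored)}:        &\quad 1 - \theta e^{-\theta d}, \qquad d = qD, \\
T_j > D \ \text{(uncensored)}:             &\quad \theta^2 e^{-\theta q T_j}, \\
T_j > D \ \text{(right-censored, } j \in \{0, N\}): &\quad \theta e^{-\theta q T_j}.
\end{align*}
```

On the log scale an uncensored time greater than D contributes 2 \log \theta, whereas a right-censored time greater than D contributes only \log \theta. Writing the total as 2 N_1 \log \theta therefore requires each such right-censored time to count one half towards N_1.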

If N_1 = 0 then we are in the degenerate case where there is one cluster (all inter-exceedance times are left-censored) and the likelihood is maximized at \theta = 0.

If N_0 = 0 then all exceedances occur singly (no inter-exceedance times are left-censored) and the likelihood is maximized at \theta = 1.
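
The log-likelihood and the two boundary cases above can be sketched as follows. The function names `dgaps_loglik_sketch` and `dgaps_mle_sketch` are hypothetical and are not part of exdex, and the numerical values in the call are illustrative only, not taken from any data set. A one-dimensional search with `stats::optimize` is one simple choice here; it is not necessarily how the package maximizes the likelihood.

```r
# Hypothetical helper, not part of exdex: the D-gaps log-likelihood in Details
dgaps_loglik_sketch <- function(theta, N0, N1, sum_qtd, q, D) {
  N0 * log(1 - theta * exp(-theta * q * D)) + 2 * N1 * log(theta) -
    theta * sum_qtd
}

# Hypothetical helper: handle the boundary cases first, otherwise search (0, 1)
dgaps_mle_sketch <- function(N0, N1, sum_qtd, q, D) {
  if (N1 == 0) return(0)   # all inter-exceedance times are left-censored
  if (N0 == 0) return(1)   # no inter-exceedance times are left-censored
  optimize(dgaps_loglik_sketch, interval = c(0, 1), maximum = TRUE,
           N0 = N0, N1 = N1, sum_qtd = sum_qtd, q = q, D = D)$maximum
}

# Illustrative sufficient statistics (made up for this sketch)
dgaps_mle_sketch(N0 = 10, N1 = 5, sum_qtd = 12, q = 0.1, D = 2)
```

Only N0, N1 and sum_qtd, together with q and D, are needed to evaluate the log-likelihood at any \theta, which is why they serve as sufficient statistics.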

Value

A list containing the sufficient statistics, with components

`N0`	the number of left-censored inter-exceedance times.
`N1`	contribution from inter-exceedance times that are not left-censored (see Details).
`sum_qtd`	the sum of the (scaled) inter-exceedance times that are not left-censored, that is, `q (I_0 T_0 + \cdots + I_N T_N)`, where `q` is estimated by the proportion of threshold exceedances.
`n_dgaps`	the number of inter-exceedance times that contribute to the log-likelihood.
`q_u`	the sample proportion of values that exceed the threshold.
`D`	the input value of `D`.

References

Holesovsky, J. and Fusek, M. (2020) Estimation of the extremal index using censored distributions. Extremes, 23, 197-213. doi:10.1007/s10687-020-00374-3

Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf

Examples

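# Threshold at the 90% sample quantile of the newlyn data (supplied with exdex)
u <- quantile(newlyn, probs = 0.90)
# q_u is not supplied, so it is estimated by the proportion of exceedances of u
dgaps_stat(newlyn, u = u, D = 1)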

[Package exdex version 1.2.3 Index]