R: Weibull Bulk and GPD Tail Interval Transition Mixture Model

itmweibullgpd {evmix}

R Documentation

Weibull Bulk and GPD Tail Interval Transition Mixture Model

Description

Density, cumulative distribution function, quantile function and random number generation for the Weibull bulk and GPD tail interval transition mixture model. The parameters are the Weibull shape wshape and scale wscale, threshold u, interval half-width epsilon, GPD scale sigmau and shape xi.

Usage

ditmweibullgpd(x, wshape = 1, wscale = 1, epsilon = sqrt(wscale^2 *
  gamma(1 + 2/wshape) - (wscale * gamma(1 + 1/wshape))^2),
  u = qweibull(0.9, wshape, wscale), sigmau = sqrt(wscale^2 * gamma(1 +
  2/wshape) - (wscale * gamma(1 + 1/wshape))^2), xi = 0, log = FALSE)

pitmweibullgpd(q, wshape = 1, wscale = 1, epsilon = sqrt(wscale^2 *
  gamma(1 + 2/wshape) - (wscale * gamma(1 + 1/wshape))^2),
  u = qweibull(0.9, wshape, wscale), sigmau = sqrt(wscale^2 * gamma(1 +
  2/wshape) - (wscale * gamma(1 + 1/wshape))^2), xi = 0,
  lower.tail = TRUE)

qitmweibullgpd(p, wshape = 1, wscale = 1, epsilon = sqrt(wscale^2 *
  gamma(1 + 2/wshape) - (wscale * gamma(1 + 1/wshape))^2),
  u = qweibull(0.9, wshape, wscale), sigmau = sqrt(wscale^2 * gamma(1 +
  2/wshape) - (wscale * gamma(1 + 1/wshape))^2), xi = 0,
  lower.tail = TRUE)

ritmweibullgpd(n = 1, wshape = 1, wscale = 1,
  epsilon = sqrt(wscale^2 * gamma(1 + 2/wshape) - (wscale * gamma(1 +
  1/wshape))^2), u = qweibull(0.9, wshape, wscale),
  sigmau = sqrt(wscale^2 * gamma(1 + 2/wshape) - (wscale * gamma(1 +
  1/wshape))^2), xi = 0)

Arguments

`x`	quantiles
`wshape`	Weibull shape (positive)
`wscale`	Weibull scale (positive)
`epsilon`	interval half-width
`u`	threshold
`sigmau`	scale parameter (positive)
`xi`	shape parameter
`log`	logical, if TRUE then log density
`q`	quantiles
`lower.tail`	logical, if FALSE then upper tail probabilities
`p`	cumulative probabilities
`n`	sample size (positive integer)

Details

The interval transition mixture model combines a Weibull for the bulk model with GPD for the tail model, with a smooth transition over the interval (u-epsilon, u+epsilon). The mixing function warps the Weibull to map from (u-epsilon, u) to (u-epsilon, u+epsilon) and warps the GPD from (u, u+epsilon) to (u-epsilon, u+epsilon).

The cumulative distribution function is defined by

F(x)=\kappa(H_t(q(x)) + G(p(x)))

where H_t(x) and G(X) are the truncated Weibull and conditional GPD cumulative distribution functions (i.e. pweibull(x, wshape, wscale) and pgpd(x, u, sigmau, xi)) respectively. The truncated Weibull is not renormalised to be proper, so H_t(x) contrubutes pweibull(u, wshape, wscale) to the cdf for all x\geq (u + \epsilon). The normalisation constant \kappa ensures a proper density, given by 1/(1+pweibull(u, wshape, wscale)) where 1 is from GPD component and latter is contribution from Weibull component.

The mixing functions q(x) and p(x) suggested by Holden and Haug (2013) have been implemented. These are symmetric about the threshold u. So for computational convenience only q(x;u) has been implemented as qmix for a given u, with the complementary mixing function is then defined as p(x;u)=-q(-x;-u).

A minor adaptation of the mixing function has been applied. For the mixture model to function correctly q(x)>=u for all x\ge u+\epsilon, as then the bulk model will contribute the constant H_t(u)=H(u) for all x above the interval. Holden and Haug (2013) define q(x)=x-\epsilon for all x\ge u. For more straightforward and interpretable computational implementation the mixing function has been set to the threshold q(x)=u for all x\ge u, so the cdf/pdf of the Weibull model can be used directly. We do not have to define cdf/pdf for the non-proper truncated Weibull seperately. As such q'(x)=0 for all x\ge u in qmixxprime, which also it makes clearer that Weibull does not contribute to the tail above the interval and vice-versa.

The quantile function within the transition interval is not available in closed form, so has to be solved numerically. Outside of the interval, the quantile are obtained from the Weibull and GPD components directly.

Value

ditmweibullgpd gives the density, pitmweibullgpd gives the cumulative distribution function, qitmweibullgpd gives the quantile function and ritmweibullgpd gives a random sample.

Note

All inputs are vectorised except log and lower.tail. The main inputs (x, p or q) and parameters must be either a scalar or a vector. If vectors are provided they must all be of the same length, and the function will be evaluated for each element of vector. In the case of ritmweibullgpd any input vector must be of length n.

Default values are provided for all inputs, except for the fundamentals x, q and p. The default sample size for ritmweibullgpd is 1.

Missing (NA) and Not-a-Number (NaN) values in x, p and q are passed through as is and infinite values are set to NA. None of these are not permitted for the parameters.

Error checking of the inputs (e.g. invalid probabilities) is carried out and will either stop or give warning message as appropriate.

Author(s)

Alfadino Akbar and Carl Scarrott carl.scarrott@canterbury.ac.nz

References

http://en.wikipedia.org/wiki/Weibull_distribution

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Holden, L. and Haug, O. (2013). A mixture model for unsupervised tail estimation. arxiv:0902.4137

Examples

## Not run: 
set.seed(1)
par(mfrow = c(2, 2))

xx = seq(0.001, 5, 0.01)
u = 1.5
epsilon = 0.4
kappa = 1/(1 + pweibull(u, 2, 1))

f = ditmweibullgpd(xx, wshape = 2, wscale = 1, epsilon, u, sigmau = 1, xi = 0.5)
plot(xx, f, ylim = c(0, 1), xlim = c(0, 5), type = 'l', lwd = 2, xlab = "x", ylab = "density")
lines(xx, kappa * dgpd(xx, u, sigmau = 1, xi = 0.5), col = "red", lty = 2, lwd = 2)
lines(xx, kappa * dweibull(xx, 2, 1), col = "blue", lty = 2, lwd = 2)
abline(v = u + epsilon * seq(-1, 1), lty = c(2, 1, 2))
legend('topright', c('Weibull-GPD ITM', 'kappa*Weibull', 'kappa*GPD'),
      col = c("black", "blue", "red"), lty = c(1, 2, 2), lwd = 2)

# cdf contributions
F = pitmweibullgpd(xx, wshape = 2, wscale = 1, epsilon, u, sigmau = 1, xi = 0.5)
plot(xx, F, ylim = c(0, 1), xlim = c(0, 5), type = 'l', lwd = 2, xlab = "x", ylab = "cdf")
lines(xx[xx > u], kappa * (pweibull(u, 2, 1) + pgpd(xx[xx > u], u, sigmau = 1, xi = 0.5)),
     col = "red", lty = 2, lwd = 2)
lines(xx[xx <= u], kappa * pweibull(xx[xx <= u], 2, 1), col = "blue", lty = 2, lwd = 2)
abline(v = u + epsilon * seq(-1, 1), lty = c(2, 1, 2))
legend('topright', c('Weibull-GPD ITM', 'kappa*Weibull', 'kappa*GPD'),
      col = c("black", "blue", "red"), lty = c(1, 2, 2), lwd = 2)

# simulated data density histogram and overlay true density 
x = ritmweibullgpd(10000, wshape = 2, wscale = 1, epsilon, u, sigmau = 1, xi = 0.5)
hist(x, freq = FALSE, breaks = seq(0, 1000, 0.1), xlim = c(0, 5))
lines(xx, ditmweibullgpd(xx, wshape = 2, wscale = 1, epsilon, u, sigmau = 1, xi = 0.5),
  lwd = 2, col = 'black')  

## End(Not run)

[Package evmix version 2.12 Index]