fgkgcon {evmix}R Documentation

MLE Fitting of Kernel Density Estimate for Bulk and GPD for Both Tails with Single Continuity Constraint at Both Thresholds Extreme Value Mixture Model

Description

Maximum likelihood estimation for fitting the extreme value mixture model with kernel density estimate for bulk distribution between thresholds and conditional GPDs for both tails with continuity at thresholds. With options for profile likelihood estimation for both thresholds and fixed threshold approach.

Usage

fgkgcon(x, phiul = TRUE, phiur = TRUE, ulseq = NULL, urseq = NULL,
  fixedu = FALSE, pvector = NULL, kernel = "gaussian",
  add.jitter = FALSE, factor = 0.1, amount = NULL, std.err = TRUE,
  method = "BFGS", control = list(maxit = 10000), finitelik = TRUE,
  ...)

lgkgcon(x, lambda = NULL, ul = 0, xil = 0, phiul = TRUE, ur = 0,
  xir = 0, phiur = TRUE, bw = NULL, kernel = "gaussian",
  log = TRUE)

nlgkgcon(pvector, x, phiul = TRUE, phiur = TRUE, kernel = "gaussian",
  finitelik = FALSE)

proflugkgcon(ulr, pvector, x, phiul = TRUE, phiur = TRUE,
  kernel = "gaussian", method = "BFGS", control = list(maxit =
  10000), finitelik = TRUE, ...)

nlugkgcon(pvector, ul, ur, x, phiul = TRUE, phiur = TRUE,
  kernel = "gaussian", finitelik = FALSE)

Arguments

x

vector of sample data

phiul

probability of being below lower threshold (0, 1) or logical, see Details in help for fgng

phiur

probability of being above upper threshold (0, 1) or logical, see Details in help for fgng

ulseq

vector of lower thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood

urseq

vector of upper thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood

fixedu

logical, should threshold be fixed (at either scalar value in ulseq/urseq, or estimated from maximum of profile likelihood evaluated at sequence of thresholds in ulseq/urseq)

pvector

vector of initial values of parameters or NULL for default values, see below

kernel

kernel name (default = "gaussian")

add.jitter

logical, whether jitter is needed for rounded kernel centres

factor

see jitter

amount

see jitter

std.err

logical, should standard errors be calculated

method

optimisation method (see optim)

control

optimisation control list (see optim)

finitelik

logical, should log-likelihood return finite value for invalid parameters

...

optional inputs passed to optim

lambda

scalar bandwidth for kernel (as half-width of kernel)

ul

scalar lower tail threshold

xil

scalar lower tail GPD shape parameter

ur

scalar upper tail threshold

xir

scalar upper tail GPD shape parameter

bw

scalar bandwidth for kernel (as standard deviations of kernel)

log

logical, if TRUE then log-likelihood rather than likelihood is output

ulr

vector of length 2 giving lower and upper tail thresholds or NULL for default values

Details

The extreme value mixture model with kernel density estimate for bulk and GPD for both tails with continuity at thresholds is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.

See help for fnormgpd and fgng for details, type help fnormgpd and help fgng. Only the different features are outlined below for brevity.

The GPD sigmaul and sigmaur parameters are now specified as function of other parameters, see help for dgkgcon for details, type help gkgcon. Therefore, sigmaul and sigmaur should not be included in the parameter vector if initial values are provided, making the full parameter vector The full parameter vector is (lambda, ul, xil, ur, xir) if thresholds are also estimated and (lambda, xil, xir) for profile likelihood or fixed threshold approach.

Cross-validation likelihood is used for KDE, but standard likelihood is used for GPD components. See help for fkden for details, type help fkden.

The alternate bandwidth definitions are discussed in the kernels, with the lambda as the default used in the likelihood fitting. The bw specification is the same as used in the density function.

The possible kernels are also defined in kernels with the "gaussian" as the default choice.

The tail fractions phiul and phiur are treated separately to the other parameters, to allow for all their representations. In the fitting functions fgkgcon and proflugkgcon they are logical:

In the likelihood functions lgkgcon, nlgkgcon and nlugkgcon it can be logical or numeric:

If the profile likelihood approach is used, then a grid search over all combinations of both thresholds is carried out. The combinations which lead to less than 5 in any datapoints beyond the thresholds are not considered.

Value

Log-likelihood is given by lgkgcon and it's wrappers for negative log-likelihood from nlgkgcon and nlugkgcon. Profile likelihood for both thresholds given by proflugkgcon. Fitting function fgkgcon returns a simple list with the following elements

call: optim call
x: data vector x
init: pvector
fixedu: fixed thresholds, logical
ulseq: lower threshold vector for profile likelihood or scalar for fixed threshold
urseq: upper threshold vector for profile likelihood or scalar for fixed threshold
nllhuseq: profile negative log-likelihood at each threshold pair in (ulseq, urseq)
optim: complete optim output
mle: vector of MLE of parameters
cov: variance-covariance matrix of MLE of parameters
se: vector of standard errors of MLE of parameters
rate: phiu to be consistent with evd
nllh: minimum negative log-likelihood
n: total sample size
lambda: MLE of lambda (kernel half-width)
ul: lower threshold (fixed or MLE)
sigmaul: MLE of lower tail GPD scale (estimated from other parameters)
xil: MLE of lower tail GPD shape
phiul: MLE of lower tail fraction (bulk model or parameterised approach)
se.phiul: standard error of MLE of lower tail fraction
ur: upper threshold (fixed or MLE)
sigmaur: MLE of upper tail GPD scale (estimated from other parameters)
xir: MLE of upper tail GPD shape
phiur: MLE of upper tail fraction (bulk model or parameterised approach)
se.phiur: standard error of MLE of lower tail fraction
bw: MLE of bw (kernel standard deviations)
kernel: kernel name

Warning

See important warnings about cross-validation likelihood estimation in fkden, type help fkden.

Acknowledgments

See Acknowledgments in fnormgpd, type help fnormgpd. Based on code by Anna MacDonald produced for MATLAB.

Note

The data and kernel centres are both vectors. Infinite and missing sample values (and kernel centres) are dropped.

When pvector=NULL then the initial values are:

Author(s)

Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Kernel_density_estimation

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go

Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.

Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.

MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.

Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman && Hall.

See Also

kernels, kfun, density, bw.nrd0 and dkde in ks package. fgpd and gpd.

Other kden: bckden, fbckden, fgkg, fkdengpdcon, fkdengpd, fkden, kdengpdcon, kdengpd, kden

Other kdengpdcon: bckdengpdcon, fbckdengpdcon, fkdengpdcon, fkdengpd, gkgcon, kdengpdcon, kdengpd

Other gkg: fgkg, fkdengpd, gkgcon, gkg, kdengpd, kden

Other gkgcon: fgkg, fkdengpdcon, gkgcon, gkg, kdengpdcon

Other fgkgcon: gkgcon

Examples

## Not run: 
set.seed(1)
par(mfrow = c(2, 1))

x = rnorm(1000)
xx = seq(-4, 4, 0.01)
y = dnorm(xx)

# Continuity constraint
fit = fgkgcon(x)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
   ur, xir, phiur), col="red"))
abline(v = c(fit$ul, fit$ur), col = "red")
  
# No continuity constraint
fit2 = fgkg(x)
with(fit2, lines(xx, dgkg(xx, x, lambda, ul, sigmaul, xil, phiul,
   ur, sigmaur, xir, phiur), col="blue"))
abline(v = c(fit2$ul, fit2$ur), col = "blue")
legend("topleft", c("True Density","No continuity constraint","With continuty constraint"),
  col=c("black", "blue", "red"), lty = 1)
  
# Profile likelihood for initial value of threshold and fixed threshold approach
fitu = fgkgcon(x, ulseq = seq(-2, -0.2, length = 10), 
 urseq = seq(0.2, 2, length = 10))
fitfix = fgkgcon(x, ulseq = seq(-2, -0.2, length = 10), 
 urseq = seq(0.2, 2, length = 10), fixedu = TRUE)

hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
   ur, xir, phiur), col="red"))
abline(v = c(fit$ul, fit$ur), col = "red")
with(fitu, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
   ur, xir, phiur), col="purple"))
abline(v = c(fitu$ul, fitu$ur), col = "purple")
with(fitfix, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
   ur, xir, phiur), col="darkgreen"))
abline(v = c(fitfix$ul, fitfix$ur), col = "darkgreen")
legend("topright", c("True Density","Default initial value (90% quantile)",
 "Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
 col=c("black", "red", "purple", "darkgreen"), lty = 1)

## End(Not run)
  

[Package evmix version 2.12 Index]