fbckdengpdcon {evmix}R Documentation

MLE Fitting of Boundary Corrected Kernel Density Estimate for Bulk and GPD Tail Extreme Value Mixture Model with Single Continuity Constraint

Description

Maximum likelihood estimation for fitting the extreme value mixture model with boundary corrected kernel density estimate for bulk distribution upto the threshold and conditional GPD above thresholdwith continuity at threshold. With options for profile likelihood estimation for threshold and fixed threshold approach.

Usage

fbckdengpdcon(x, phiu = TRUE, useq = NULL, fixedu = FALSE,
  pvector = NULL, kernel = "gaussian", bcmethod = "simple",
  proper = TRUE, nn = "jf96", offset = NULL, xmax = NULL,
  add.jitter = FALSE, factor = 0.1, amount = NULL, std.err = TRUE,
  method = "BFGS", control = list(maxit = 10000), finitelik = TRUE,
  ...)

lbckdengpdcon(x, lambda = NULL, u = 0, xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", bcmethod = "simple",
  proper = TRUE, nn = "jf96", offset = NULL, xmax = NULL,
  log = TRUE)

nlbckdengpdcon(pvector, x, phiu = TRUE, kernel = "gaussian",
  bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
  xmax = NULL, finitelik = FALSE)

proflubckdengpdcon(u, pvector, x, phiu = TRUE, kernel = "gaussian",
  bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
  xmax = NULL, method = "BFGS", control = list(maxit = 10000),
  finitelik = TRUE, ...)

nlubckdengpdcon(pvector, u, x, phiu = TRUE, kernel = "gaussian",
  bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
  xmax = NULL, finitelik = FALSE)

Arguments

x

vector of sample data

phiu

probability of being above threshold (0, 1) or logical, see Details in help for fnormgpd

useq

vector of thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood

fixedu

logical, should threshold be fixed (at either scalar value in useq, or estimated from maximum of profile likelihood evaluated at sequence of thresholds in useq)

pvector

vector of initial values of parameters or NULL for default values, see below

kernel

kernel name (default = "gaussian")

bcmethod

boundary correction method

proper

logical, whether density is renormalised to integrate to unity (where needed)

nn

non-negativity correction method (simple boundary correction only)

offset

offset added to kernel centres (logtrans only) or NULL

xmax

upper bound on support (copula and beta kernels only) or NULL

add.jitter

logical, whether jitter is needed for rounded kernel centres

factor

see jitter

amount

see jitter

std.err

logical, should standard errors be calculated

method

optimisation method (see optim)

control

optimisation control list (see optim)

finitelik

logical, should log-likelihood return finite value for invalid parameters

...

optional inputs passed to optim

lambda

bandwidth for kernel (as half-width of kernel) or NULL

u

scalar threshold value

xi

scalar shape parameter

bw

bandwidth for kernel (as standard deviations of kernel) or NULL

log

logical, if TRUE then log-likelihood rather than likelihood is output

Details

The extreme value mixture model with boundary corrected kernel density estimate (BCKDE) for bulk and GPD tail with continuity at threshold is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.

See help for fnormgpd for details, type help fnormgpd. Only the different features are outlined below for brevity.

The GPD sigmau parameter is now specified as function of other parameters, see help for dbckdengpdcon for details, type help bckdengpdcon. Therefore, sigmau should not be included in the parameter vector if initial values are provided, making the full parameter vector (lambda, u, xi) if threshold is also estimated and (lambda, xi) for profile likelihood or fixed threshold approach.

Negative data are ignored.

Cross-validation likelihood is used for BCKDE, but standard likelihood is used for GPD component. See help for fkden for details, type help fkden.

The alternate bandwidth definitions are discussed in the kernels, with the lambda as the default used in the likelihood fitting. The bw specification is the same as used in the density function.

The possible kernels are also defined in kernels with the "gaussian" as the default choice.

Unlike the standard KDE, there is no general rule-of-thumb bandwidth for all these estimators, with only certain methods having a guideline in the literature, so none have been implemented. Hence, a bandwidth must always be specified.

The simple, renorm, beta1, beta2 gamma1 and gamma2 boundary corrected kernel density estimates require renormalisation, achieved by numerical integration, so are very time consuming.

Value

lbckdengpdcon, nlbckdengpdcon, and nlubckdengpdcon give the log-likelihood, negative log-likelihood and profile likelihood for threshold. Profile likelihood for single threshold is given by proflubckdengpdcon. fbckdengpdcon returns a simple list with the following elements

call: optim call
x: data vector x
init: pvector
fixedu: fixed threshold, logical
useq: threshold vector for profile likelihood or scalar for fixed threshold
nllhuseq: profile negative log-likelihood at each threshold in useq
optim: complete optim output
mle: vector of MLE of parameters
cov: variance-covariance matrix of MLE of parameters
se: vector of standard errors of MLE of parameters
rate: phiu to be consistent with evd
nllh: minimum negative log-likelihood
n: total sample size
lambda: MLE of lambda (kernel half-width)
u: threshold (fixed or MLE)
sigmau: MLE of GPD scale(estimated from other parameters)
xi: MLE of GPD shape
phiu: MLE of tail fraction (bulk model or parameterised approach)
se.phiu: standard error of MLE of tail fraction
bw: MLE of bw (kernel standard deviations)
kernel: kernel name
bcmethod: boundary correction method
proper: logical, whether renormalisation is requested
nn: non-negative correction method
offset: offset for log transformation method
xmax: maximum value of scaled beta or copula

Boundary Correction Methods

See dbckden for details of BCKDE methods.

Warning

See important warnings about cross-validation likelihood estimation in fkden, type help fkden.

See important warnings about boundary correction approaches in dbckden, type help bckden.

Acknowledgments

See Acknowledgments in fnormgpd, type help fnormgpd. Based on code by Anna MacDonald produced for MATLAB.

Note

See notes in fnormgpd for details, type help fnormgpd. Only the different features are outlined below for brevity.

No default initial values for parameter vector are provided, so will stop evaluation if pvector is left as NULL. Avoid setting the starting value for the shape parameter to xi=0 as depending on the optimisation method it may be get stuck.

The data and kernel centres are both vectors. Infinite, missing and negative sample values (and kernel centres) are dropped.

Author(s)

Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Kernel_density_estimation

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go

Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.

Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.

MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.

MacDonald, A., C. J. Scarrott, and D. S. Lee (2011). Boundary correction, consistency and robustness of kernel densities using extreme value theory. Submitted. Available from: http://www.math.canterbury.ac.nz/~c.scarrott.

Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman && Hall.

See Also

kernels, kfun, density, bw.nrd0 and dkde in ks package. fgpd and gpd.

Other kdengpdcon: bckdengpdcon, fgkgcon, fkdengpdcon, fkdengpd, gkgcon, kdengpdcon, kdengpd

Other bckden: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkden, kden

Other bckdengpd: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkdengpd, gkg, kdengpd, kden

Other bckdengpdcon: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkdengpdcon, gkgcon, kdengpdcon

Other fbckdengpdcon: bckdengpdcon

Examples

## Not run: 
set.seed(1)
par(mfrow = c(2, 1))

x = rgamma(500, 2, 1)
xx = seq(-0.1, 10, 0.01)
y = dgamma(xx, 2, 1)

# Continuity constraint
pinit = c(0.1, quantile(x, 0.9), 0.1) # initial values required for BCKDE
fit = fbckdengpdcon(x, pvector = pinit, bcmethod = "cutnorm")
hist(x, breaks = 100, freq = FALSE, xlim = c(-0.1, 10))
lines(xx, y)
with(fit, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bcmethod = "cutnorm"), col="red"))
abline(v = fit$u, col = "red")
  
# No continuity constraint
pinit = c(0.1, quantile(x, 0.9), 1, 0.1) # initial values required for BCKDE
fit2 = fbckdengpd(x, pvector = pinit, bcmethod = "cutnorm")
with(fit2, lines(xx, dbckdengpd(xx, x, lambda, u, sigmau, xi, bc = "cutnorm"), col="blue"))
abline(v = fit2$u, col = "blue")
legend("topright", c("True Density","No continuity constraint","With continuty constraint"),
  col=c("black", "blue", "red"), lty = 1)
  
# Profile likelihood for initial value of threshold and fixed threshold approach
pinit = c(0.1, 0.1) # notice threshold dropped from initial values
fitu = fbckdengpdcon(x, useq = seq(1, 6, length = 20), pvector = pinit, bcmethod = "cutnorm")
fitfix = fbckdengpdcon(x, useq = seq(1, 6, length = 20), fixedu = TRUE, pv = pinit, bc = "cutnorm")

hist(x, breaks = 100, freq = FALSE, xlim = c(-0.1, 10))
lines(xx, y)
with(fit, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="red"))
abline(v = fit$u, col = "red")
with(fitu, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="purple"))
abline(v = fitu$u, col = "purple")
with(fitfix, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="darkgreen"))
abline(v = fitfix$u, col = "darkgreen")
legend("topright", c("True Density","Default initial value (90% quantile)",
 "Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
 col=c("black", "red", "purple", "darkgreen"), lty = 1)

## End(Not run)


[Package evmix version 2.12 Index]