fbckdengpdcon {evmix} | R Documentation |
MLE Fitting of Boundary Corrected Kernel Density Estimate for Bulk and GPD Tail Extreme Value Mixture Model with Single Continuity Constraint
Description
Maximum likelihood estimation for fitting the extreme value mixture model with boundary corrected kernel density estimate for bulk distribution up to the threshold and conditional GPD above threshold with continuity at threshold. With options for profile likelihood estimation for threshold and fixed threshold approach.
Usage
fbckdengpdcon(x, phiu = TRUE, useq = NULL, fixedu = FALSE,
pvector = NULL, kernel = "gaussian", bcmethod = "simple",
proper = TRUE, nn = "jf96", offset = NULL, xmax = NULL,
add.jitter = FALSE, factor = 0.1, amount = NULL, std.err = TRUE,
method = "BFGS", control = list(maxit = 10000), finitelik = TRUE,
...)
lbckdengpdcon(x, lambda = NULL, u = 0, xi = 0, phiu = TRUE,
bw = NULL, kernel = "gaussian", bcmethod = "simple",
proper = TRUE, nn = "jf96", offset = NULL, xmax = NULL,
log = TRUE)
nlbckdengpdcon(pvector, x, phiu = TRUE, kernel = "gaussian",
bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
xmax = NULL, finitelik = FALSE)
proflubckdengpdcon(u, pvector, x, phiu = TRUE, kernel = "gaussian",
bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
xmax = NULL, method = "BFGS", control = list(maxit = 10000),
finitelik = TRUE, ...)
nlubckdengpdcon(pvector, u, x, phiu = TRUE, kernel = "gaussian",
bcmethod = "simple", proper = TRUE, nn = "jf96", offset = NULL,
xmax = NULL, finitelik = FALSE)
Arguments
x |
vector of sample data |
phiu |
probability of being above threshold |
useq |
vector of thresholds (or scalar) to be considered in profile likelihood or
|
fixedu |
logical, should threshold be fixed (at either scalar value in |
pvector |
vector of initial values of parameters or |
kernel |
kernel name ( |
bcmethod |
boundary correction method |
proper |
logical, whether density is renormalised to integrate to unity (where needed) |
nn |
non-negativity correction method (simple boundary correction only) |
offset |
offset added to kernel centres (logtrans only) or |
xmax |
upper bound on support (copula and beta kernels only) or |
add.jitter |
logical, whether jitter is needed for rounded kernel centres |
factor |
see |
amount |
see |
std.err |
logical, should standard errors be calculated |
method |
optimisation method (see |
control |
optimisation control list (see |
finitelik |
logical, should log-likelihood return finite value for invalid parameters |
... |
optional inputs passed to |
lambda |
bandwidth for kernel (as half-width of kernel) or |
u |
scalar threshold value |
xi |
scalar shape parameter |
bw |
bandwidth for kernel (as standard deviations of kernel) or |
log |
logical, if |
Details
The extreme value mixture model with boundary corrected kernel density estimate (BCKDE) for bulk and GPD tail with continuity at threshold is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.
See help for fnormgpd for details, type help fnormgpd. Only the different features are outlined below for brevity.
The GPD sigmau parameter is now specified as a function of the other parameters, see help for dbckdengpdcon for details, type help bckdengpdcon. Therefore, sigmau should not be included in the parameter vector if initial values are provided, making the full parameter vector (lambda, u, xi) if the threshold is also estimated and (lambda, xi) for the profile likelihood or fixed threshold approach.
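The continuity constraint can be written out as a short derivation. This is a sketch under the assumption that the model takes the usual evmix mixture form; the symbols h and H (BCKDE bulk density and distribution function), g (GPD density) and \phi_u (tail fraction) are notation introduced here for illustration, and the dbckdengpdcon help remains the authoritative definition.

```latex
% Case 1: bulk model based tail fraction, \phi_u = 1 - H(u) (phiu = TRUE)
\begin{align*}
f(x) &=
  \begin{cases}
    h(x), & x \le u, \\
    \{1 - H(u)\}\, g(x \mid u, \sigma_u, \xi), & x > u,
  \end{cases} \\
% The GPD density at its threshold is g(u \mid u, \sigma_u, \xi) = 1/\sigma_u,
% so equating the two pieces at x = u gives
h(u) &= \frac{1 - H(u)}{\sigma_u}
  \quad\Longrightarrow\quad
  \sigma_u = \frac{1 - H(u)}{h(u)}.
\end{align*}

% Case 2: parameterised tail fraction \phi_u (phiu given as a value or estimated)
\begin{align*}
f(x) &=
  \begin{cases}
    (1 - \phi_u)\, \dfrac{h(x)}{H(u)}, & x \le u, \\
    \phi_u\, g(x \mid u, \sigma_u, \xi), & x > u,
  \end{cases} \\
(1 - \phi_u)\, \frac{h(u)}{H(u)} &= \frac{\phi_u}{\sigma_u}
  \quad\Longrightarrow\quad
  \sigma_u = \frac{\phi_u\, H(u)}{(1 - \phi_u)\, h(u)}.
\end{align*}
```

In both cases sigmau is fully determined by lambda, u and (where relevant) phiu, which is why it drops out of the parameter vector.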
Negative data are ignored.
Cross-validation likelihood is used for BCKDE, but standard likelihood is used for GPD component. See help for fkden for details, type help fkden.
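The reason for using a cross-validation likelihood can be illustrated with a minimal base R sketch. It assumes a plain Gaussian KDE with no boundary correction and no GPD tail, so it is not the likelihood used by fbckdengpdcon; the function names naive_loglik and cv_loglik are hypothetical and not part of evmix.

```r
# Naive log-likelihood: every point contributes to its own density estimate,
# so the value grows without bound as lambda shrinks towards zero
naive_loglik <- function(x, lambda) {
  k <- dnorm(outer(x, x, "-"), sd = lambda)  # n x n matrix of kernel values
  sum(log(rowMeans(k)))
}

# Leave-one-out cross-validation log-likelihood: the density at each point
# is estimated from the other n - 1 points only
cv_loglik <- function(x, lambda) {
  n <- length(x)
  k <- dnorm(outer(x, x, "-"), sd = lambda)
  diag(k) <- 0                               # drop each point's own kernel
  sum(log(rowSums(k) / (n - 1)))
}

set.seed(1)
x <- rgamma(200, 2, 1)
lambdas <- c(0.001, 0.01, 0.1, 0.3, 1, 3)

naive <- sapply(lambdas, function(l) naive_loglik(x, l))
cv <- sapply(lambdas, function(l) cv_loglik(x, l))

# Compare the two criteria across bandwidths
print(data.frame(lambda = lambdas, naive = naive, cv = cv))
```

The naive criterion favours the smallest bandwidth on the grid, while the cross-validation criterion penalises it (an isolated point can even give a value of -Inf) and prefers a moderate bandwidth.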
The alternate bandwidth definitions are discussed in the kernels, with the lambda as the default used in the likelihood fitting. The bw specification is the same as used in the density function.
The possible kernels are also defined in kernels with the "gaussian" as the default choice.
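The relationship between the two bandwidth definitions can be sketched in base R. This assumes that the non-Gaussian kernels are defined on [-1, 1] and that bw equals lambda times the standard deviation of the unscaled kernel; both assumptions should be checked against the kernels help. The helper kernel_sd and the two kernel functions are hypothetical names written for this illustration, not evmix functions.

```r
# Standard deviation of a symmetric kernel supported on [-1, 1],
# obtained by numerical integration of z^2 * K(z)
kernel_sd <- function(kfun) {
  sqrt(integrate(function(z) z^2 * kfun(z), lower = -1, upper = 1)$value)
}

uniform_kernel <- function(z) ifelse(abs(z) <= 1, 0.5, 0)
epanechnikov_kernel <- function(z) ifelse(abs(z) <= 1, 0.75 * (1 - z^2), 0)

lambda <- 0.6  # bandwidth as kernel half-width

# Equivalent bandwidths expressed as kernel standard deviations
bw_uniform <- lambda * kernel_sd(uniform_kernel)    # lambda / sqrt(3)
bw_epan <- lambda * kernel_sd(epanechnikov_kernel)  # lambda / sqrt(5)

print(c(uniform = bw_uniform, epanechnikov = bw_epan))
```

For the Gaussian kernel the unscaled kernel has unit standard deviation, so under these assumptions lambda and bw coincide.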
Unlike the standard KDE, there is no general rule-of-thumb bandwidth for all these estimators, with only certain methods having a guideline in the literature, so none have been implemented. Hence, a bandwidth must always be specified.
The simple, renorm, beta1, beta2, gamma1 and gamma2 boundary corrected kernel density estimates require renormalisation, achieved by numerical integration, so are very time consuming.
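The cost of renormalisation can be seen in a minimal base R sketch. It uses a generic truncate-and-renormalise estimator as a stand-in, not any of the specific evmix boundary correction methods, and the names kde_truncated and kde_proper are hypothetical.

```r
set.seed(1)
centres <- rgamma(100, 2, 1)  # non-negative kernel centres
lambda <- 0.3                 # Gaussian kernel standard deviation

# Gaussian KDE set to zero below the boundary: the mass that leaks
# below zero is lost, so this function integrates to less than one
kde_truncated <- function(x) {
  dens <- sapply(x, function(x0) mean(dnorm(x0, mean = centres, sd = lambda)))
  ifelse(x >= 0, dens, 0)
}

# Normalising constant by numerical integration; the upper limit is chosen
# so that the mass beyond it is negligible
upper <- max(centres) + 10 * lambda
const <- integrate(kde_truncated, lower = 0, upper = upper,
                   subdivisions = 1000L)$value

# Renormalised estimate, which integrates to one over the support
kde_proper <- function(x) kde_truncated(x) / const
total <- integrate(kde_proper, lower = 0, upper = upper,
                   subdivisions = 1000L)$value

print(c(const = const, total = total))
```

Each call to integrate evaluates the full kernel sum at many points, and in a likelihood fit the constant has to be recomputed whenever the bandwidth changes, which is where the running time goes.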
Value
lbckdengpdcon, nlbckdengpdcon, and nlubckdengpdcon give the log-likelihood, negative log-likelihood and profile likelihood for threshold. Profile likelihood for single threshold is given by proflubckdengpdcon.
fbckdengpdcon returns a simple list with the following elements
call : | optim call |
x : | data vector x |
init : | pvector |
fixedu : | fixed threshold, logical |
useq : | threshold vector for profile likelihood or scalar for fixed threshold |
nllhuseq : | profile negative log-likelihood at each threshold in useq |
optim : | complete optim output |
mle : | vector of MLE of parameters |
cov : | variance-covariance matrix of MLE of parameters |
se : | vector of standard errors of MLE of parameters |
rate : | phiu to be consistent with evd |
nllh : | minimum negative log-likelihood |
n : | total sample size |
lambda : | MLE of lambda (kernel half-width) |
u : | threshold (fixed or MLE) |
sigmau : | MLE of GPD scale (estimated from other parameters) |
xi : | MLE of GPD shape |
phiu : | MLE of tail fraction (bulk model or parameterised approach) |
se.phiu : | standard error of MLE of tail fraction |
bw : | MLE of bw (kernel standard deviations) |
kernel : | kernel name |
bcmethod : | boundary correction method |
proper : | logical, whether renormalisation is requested |
nn : | non-negative correction method |
offset : | offset for log transformation method |
xmax : | maximum value of scaled beta or copula |
Boundary Correction Methods
See dbckden for details of BCKDE methods.
Warning
See important warnings about cross-validation likelihood estimation in fkden, type help fkden.
See important warnings about boundary correction approaches in dbckden, type help bckden.
Acknowledgments
See Acknowledgments in fnormgpd, type help fnormgpd. Based on code by Anna MacDonald produced for MATLAB.
Note
See notes in fnormgpd for details, type help fnormgpd. Only the different features are outlined below for brevity.
No default initial values for the parameter vector are provided, so evaluation will stop if pvector is left as NULL. Avoid setting the starting value for the shape parameter to xi=0 as, depending on the optimisation method, it may get stuck.
The data and kernel centres are both vectors. Infinite, missing and negative sample values (and kernel centres) are dropped.
Author(s)
Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz
References
http://www.math.canterbury.ac.nz/~c.scarrott/evmix
http://en.wikipedia.org/wiki/Kernel_density_estimation
http://en.wikipedia.org/wiki/Cross-validation_(statistics)
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.
MacDonald, A., C. J. Scarrott, and D. S. Lee (2011). Boundary correction, consistency and robustness of kernel densities using extreme value theory. Submitted. Available from: http://www.math.canterbury.ac.nz/~c.scarrott.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
See Also
kernels, kfun, density, bw.nrd0 and dkde in ks package.
fgpd and gpd.
Other kdengpdcon: bckdengpdcon, fgkgcon, fkdengpdcon, fkdengpd, gkgcon, kdengpdcon, kdengpd
Other bckden: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkden, kden
Other bckdengpd: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkdengpd, gkg, kdengpd, kden
Other bckdengpdcon: bckdengpdcon, bckdengpd, bckden, fbckdengpd, fbckden, fkdengpdcon, gkgcon, kdengpdcon
Other fbckdengpdcon: bckdengpdcon
Examples
## Not run:
set.seed(1)
par(mfrow = c(2, 1))
x = rgamma(500, 2, 1)
xx = seq(-0.1, 10, 0.01)
y = dgamma(xx, 2, 1)
# Continuity constraint
pinit = c(0.1, quantile(x, 0.9), 0.1) # initial values required for BCKDE
fit = fbckdengpdcon(x, pvector = pinit, bcmethod = "cutnorm")
hist(x, breaks = 100, freq = FALSE, xlim = c(-0.1, 10))
lines(xx, y)
with(fit, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bcmethod = "cutnorm"), col="red"))
abline(v = fit$u, col = "red")
# No continuity constraint
pinit = c(0.1, quantile(x, 0.9), 1, 0.1) # initial values required for BCKDE
fit2 = fbckdengpd(x, pvector = pinit, bcmethod = "cutnorm")
with(fit2, lines(xx, dbckdengpd(xx, x, lambda, u, sigmau, xi, bc = "cutnorm"), col="blue"))
abline(v = fit2$u, col = "blue")
legend("topright", c("True Density","No continuity constraint","With continuity constraint"),
col=c("black", "blue", "red"), lty = 1)
# Profile likelihood for initial value of threshold and fixed threshold approach
pinit = c(0.1, 0.1) # notice threshold dropped from initial values
fitu = fbckdengpdcon(x, useq = seq(1, 6, length = 20), pvector = pinit, bcmethod = "cutnorm")
fitfix = fbckdengpdcon(x, useq = seq(1, 6, length = 20), fixedu = TRUE, pv = pinit, bc = "cutnorm")
hist(x, breaks = 100, freq = FALSE, xlim = c(-0.1, 10))
lines(xx, y)
with(fit, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="red"))
abline(v = fit$u, col = "red")
with(fitu, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="purple"))
abline(v = fitu$u, col = "purple")
with(fitfix, lines(xx, dbckdengpdcon(xx, x, lambda, u, xi, bc = "cutnorm"), col="darkgreen"))
abline(v = fitfix$u, col = "darkgreen")
legend("topright", c("True Density","Default initial value (90% quantile)",
"Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
col=c("black", "red", "purple", "darkgreen"), lty = 1)
## End(Not run)