fgkgcon {evmix} | R Documentation |
MLE Fitting of Kernel Density Estimate for Bulk and GPD for Both Tails with Single Continuity Constraint at Both Thresholds Extreme Value Mixture Model
Description
Maximum likelihood estimation for fitting the extreme value mixture model with kernel density estimate for bulk distribution between thresholds and conditional GPDs for both tails with continuity at thresholds. Options are available for profile likelihood estimation of both thresholds and for a fixed threshold approach.
Usage
fgkgcon(x, phiul = TRUE, phiur = TRUE, ulseq = NULL, urseq = NULL,
fixedu = FALSE, pvector = NULL, kernel = "gaussian",
add.jitter = FALSE, factor = 0.1, amount = NULL, std.err = TRUE,
method = "BFGS", control = list(maxit = 10000), finitelik = TRUE,
...)
lgkgcon(x, lambda = NULL, ul = 0, xil = 0, phiul = TRUE, ur = 0,
xir = 0, phiur = TRUE, bw = NULL, kernel = "gaussian",
log = TRUE)
nlgkgcon(pvector, x, phiul = TRUE, phiur = TRUE, kernel = "gaussian",
finitelik = FALSE)
proflugkgcon(ulr, pvector, x, phiul = TRUE, phiur = TRUE,
kernel = "gaussian", method = "BFGS", control = list(maxit =
10000), finitelik = TRUE, ...)
nlugkgcon(pvector, ul, ur, x, phiul = TRUE, phiur = TRUE,
kernel = "gaussian", finitelik = FALSE)
Arguments
x |
vector of sample data |
phiul |
probability of being below lower threshold |
phiur |
probability of being above upper threshold |
ulseq |
vector of lower thresholds (or scalar) to be considered in profile likelihood or |
urseq |
vector of upper thresholds (or scalar) to be considered in profile likelihood or |
fixedu |
logical, should threshold be fixed (at either scalar value in |
pvector |
vector of initial values of parameters or |
kernel |
kernel name ( |
add.jitter |
logical, whether jitter is needed for rounded kernel centres |
factor |
see |
amount |
see |
std.err |
logical, should standard errors be calculated |
method |
optimisation method (see |
control |
optimisation control list (see |
finitelik |
logical, should log-likelihood return finite value for invalid parameters |
... |
optional inputs passed to |
lambda |
scalar bandwidth for kernel (as half-width of kernel) |
ul |
scalar lower tail threshold |
xil |
scalar lower tail GPD shape parameter |
ur |
scalar upper tail threshold |
xir |
scalar upper tail GPD shape parameter |
bw |
scalar bandwidth for kernel (as standard deviations of kernel) |
log |
logical, if |
ulr |
vector of length 2 giving lower and upper tail thresholds or |
Details
The extreme value mixture model with kernel density estimate for bulk and GPD for both tails with continuity at thresholds is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.
See help for fnormgpd and fgng for details, type help fnormgpd and help fgng. Only the different features are outlined below for brevity.
The GPD sigmaul and sigmaur parameters are now specified as functions of the other parameters, see help for dgkgcon for details, type help gkgcon. Therefore, sigmaul and sigmaur should not be included in the parameter vector if initial values are provided. The full parameter vector is (lambda, ul, xil, ur, xir) if thresholds are also estimated and (lambda, xil, xir) for profile likelihood or fixed threshold approach.
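To illustrate why the GPD scales need not be supplied, the following is a minimal base R sketch of the continuity constraint for a Gaussian kernel, assuming the default bulk model based tail fractions (phiul=TRUE and phiur=TRUE). A GPD density equals 1/sigma at its threshold, so matching the scaled tail density to the KDE density at each threshold fixes sigma. The helper names kde_pdf, kde_cdf and con_sigma are hypothetical and are not evmix functions; help for gkgcon gives the package's own definition.

```r
# Hypothetical helpers (not evmix functions): Gaussian KDE density and CDF
kde_pdf <- function(q, centres, lambda) {
  sapply(q, function(qi) mean(dnorm(qi, mean = centres, sd = lambda)))
}
kde_cdf <- function(q, centres, lambda) {
  sapply(q, function(qi) mean(pnorm(qi, mean = centres, sd = lambda)))
}

# GPD scales implied by continuity at the thresholds, assuming bulk model
# tail fractions: phiul = H(ul) and phiur = 1 - H(ur)
con_sigma <- function(centres, lambda, ul, ur) {
  phiul <- kde_cdf(ul, centres, lambda)
  phiur <- 1 - kde_cdf(ur, centres, lambda)
  list(phiul = phiul, phiur = phiur,
       sigmaul = phiul / kde_pdf(ul, centres, lambda),
       sigmaur = phiur / kde_pdf(ur, centres, lambda))
}

set.seed(1)
x <- rnorm(1000)
s <- con_sigma(x, lambda = 0.3, ul = -1.3, ur = 1.3)

# Tail density at each threshold is phi / sigma, which matches the KDE density
c(lower_tail = s$phiul / s$sigmaul, kde = kde_pdf(-1.3, x, 0.3))
c(upper_tail = s$phiur / s$sigmaur, kde = kde_pdf(1.3, x, 0.3))
```

The bandwidth and thresholds used here are arbitrary illustrative values, not fitted estimates.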
Cross-validation likelihood is used for KDE, but standard likelihood is used for GPD components. See help for fkden for details, type help fkden.
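As a rough illustration of the cross-validation idea for a Gaussian kernel, the base R sketch below leaves each point's own kernel out when evaluating the density at that point. The function names cv_loglik and std_loglik are hypothetical, and this is not the evmix implementation, which is described in help for fkden.

```r
# Hypothetical helper: leave-one-out cross-validation log-likelihood of a
# Gaussian KDE with bandwidth lambda and kernel centres at the data
cv_loglik <- function(x, lambda) {
  n <- length(x)
  k <- outer(x, x, function(a, b) dnorm(a, mean = b, sd = lambda))
  diag(k) <- 0  # drop each point's own kernel
  sum(log(rowSums(k) / (n - 1)))
}

# Standard log-likelihood keeps the own kernel, so it grows without bound
# as lambda shrinks towards zero
std_loglik <- function(x, lambda) {
  k <- outer(x, x, function(a, b) dnorm(a, mean = b, sd = lambda))
  sum(log(rowMeans(k)))
}

set.seed(1)
x <- rnorm(200)
lambdas <- c(0.01, 0.1, 0.3, 1)
cbind(lambda = lambdas,
      cv = sapply(lambdas, cv_loglik, x = x),
      std = sapply(lambdas, std_loglik, x = x))
```

Comparing the two columns shows why the standard likelihood is unsuitable for choosing a kernel bandwidth: it rewards ever smaller bandwidths, whereas the cross-validation version penalises them.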
The alternate bandwidth definitions are discussed in kernels, with lambda as the default used in the likelihood fitting. The bw specification is the same as used in the density function.
The possible kernels are also defined in kernels with "gaussian" as the default choice.
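The difference between a half-width and a standard deviation bandwidth can be seen with a generic uniform kernel. The base R sketch below is a mathematical illustration only: it assumes a uniform kernel supported on [-lambda, lambda], and whether evmix scales its kernels in exactly this way should be checked in help for kernels.

```r
# Uniform kernel with half-width lambda: density 1 / (2 * lambda) on
# [-lambda, lambda]
lambda <- 0.5
kernel_var <- integrate(function(z) z^2 / (2 * lambda), -lambda, lambda)$value
bw <- sqrt(kernel_var)  # same kernel expressed as a standard deviation

c(lambda = lambda, bw = bw, lambda_over_sqrt3 = lambda / sqrt(3))
```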
The tail fractions phiul and phiur are treated separately to the other parameters, to allow for all their representations. In the fitting functions fgkgcon and proflugkgcon they are logical:
default values phiul=TRUE and phiur=TRUE - tail fractions specified by KDE distribution and survivor functions respectively and standard error is output as NA.
phiul=FALSE and phiur=FALSE - treated as extra parameters estimated using the MLE which is the sample proportion beyond the thresholds and standard error is output.
In the likelihood functions lgkgcon, nlgkgcon and nlugkgcon they can be logical or numeric:
logical - same as for fitting functions with default values phiul=TRUE and phiur=TRUE.
numeric - any value over range (0, 1). Notice that the tail fraction probability cannot be 0 or 1 otherwise there would be no contribution from either tail or bulk components respectively. Also, phiul+phiur<1 as bulk must contribute.
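The two representations of the tail fractions can be sketched in base R for a Gaussian kernel. This is an illustration under stated assumptions, not evmix code: the thresholds ul and ur, the bandwidth lambda and all variable names are arbitrary choices made for the example.

```r
set.seed(1)
x <- rnorm(1000)
lambda <- 0.3          # assumed bandwidth, chosen for illustration only
ul <- -1.3; ur <- 1.3  # assumed thresholds

# phiul=TRUE / phiur=TRUE: bulk model based, KDE distribution function below
# ul and KDE survivor function above ur
phiul_bulk <- mean(pnorm(ul, mean = x, sd = lambda))
phiur_bulk <- mean(pnorm(ur, mean = x, sd = lambda, lower.tail = FALSE))

# phiul=FALSE / phiur=FALSE: parameterised, MLE is the sample proportion
# beyond each threshold
phiul_prop <- mean(x < ul)
phiur_prop <- mean(x > ur)

rbind(bulk = c(phiul = phiul_bulk, phiur = phiur_bulk),
      proportion = c(phiul = phiul_prop, phiur = phiur_prop))

# Either representation must leave room for the bulk component
stopifnot(phiul_bulk + phiur_bulk < 1, phiul_prop + phiur_prop < 1)
```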
If the profile likelihood approach is used, then a grid search over all combinations of both thresholds is carried out. Combinations which lead to fewer than 5 datapoints beyond either threshold are not considered.
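The grid construction can be sketched as follows in base R. The object names (grid, n_below, n_above, keep) are hypothetical, the threshold sequences are borrowed from the Examples, and the strict inequalities used for counting datapoints beyond a threshold are an assumption. Only the filtering step is shown, not the likelihood evaluation performed by proflugkgcon.

```r
set.seed(1)
x <- rnorm(1000)
ulseq <- seq(-2, -0.2, length.out = 10)
urseq <- seq(0.2, 2, length.out = 10)

# All combinations of lower and upper thresholds
grid <- expand.grid(ul = ulseq, ur = urseq)

# Count datapoints beyond each threshold for every combination
n_below <- sapply(grid$ul, function(u) sum(x < u))
n_above <- sapply(grid$ur, function(u) sum(x > u))

# Drop combinations with fewer than 5 datapoints in either tail
keep <- n_below >= 5 & n_above >= 5
grid <- grid[keep, ]
nrow(grid)
```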
Value
Log-likelihood is given by lgkgcon and its wrappers for negative log-likelihood from nlgkgcon and nlugkgcon. Profile likelihood for both thresholds given by proflugkgcon. Fitting function fgkgcon returns a simple list with the following elements
call : | optim call |
x : | data vector x |
init : | pvector |
fixedu : | fixed thresholds, logical |
ulseq : | lower threshold vector for profile likelihood or scalar for fixed threshold |
urseq : | upper threshold vector for profile likelihood or scalar for fixed threshold |
nllhuseq : | profile negative log-likelihood at each threshold pair in (ulseq, urseq) |
optim : | complete optim output |
mle : | vector of MLE of parameters |
cov : | variance-covariance matrix of MLE of parameters |
se : | vector of standard errors of MLE of parameters |
rate : | phiu to be consistent with evd |
nllh : | minimum negative log-likelihood |
n : | total sample size |
lambda : | MLE of lambda (kernel half-width) |
ul : | lower threshold (fixed or MLE) |
sigmaul : | MLE of lower tail GPD scale (estimated from other parameters) |
xil : | MLE of lower tail GPD shape |
phiul : | MLE of lower tail fraction (bulk model or parameterised approach) |
se.phiul : | standard error of MLE of lower tail fraction |
ur : | upper threshold (fixed or MLE) |
sigmaur : | MLE of upper tail GPD scale (estimated from other parameters) |
xir : | MLE of upper tail GPD shape |
phiur : | MLE of upper tail fraction (bulk model or parameterised approach) |
se.phiur : | standard error of MLE of upper tail fraction |
bw : | MLE of bw (kernel standard deviations) |
kernel : | kernel name |
Warning
See important warnings about cross-validation likelihood estimation in fkden, type help fkden.
Acknowledgments
See Acknowledgments in fnormgpd, type help fnormgpd. Based on code by Anna MacDonald produced for MATLAB.
Note
The data and kernel centres are both vectors. Infinite and missing sample values (and kernel centres) are dropped.
When pvector=NULL then the initial values are:
normal reference rule for bandwidth, using the bw.nrd0 function, which is consistent with the density function. At least two kernel centres must be provided as the variance needs to be estimated.
lower threshold 10% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches);
upper threshold 90% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches);
MLE of GPD shape parameters beyond thresholds.
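A base R sketch of the first three default initial values, assuming they are computed directly from the sample as described above. The variable names are illustrative only, and the initial GPD shape parameters (which require GPD fits beyond the thresholds) are omitted.

```r
set.seed(1)
x <- rnorm(1000)
x <- x[is.finite(x)]  # infinite and missing values are dropped

init <- c(bandwidth = bw.nrd0(x),          # normal reference rule
          ul = unname(quantile(x, 0.1)),   # lower threshold, 10% quantile
          ur = unname(quantile(x, 0.9)))   # upper threshold, 90% quantile
init
```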
Author(s)
Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz
References
http://www.math.canterbury.ac.nz/~c.scarrott/evmix
http://en.wikipedia.org/wiki/Kernel_density_estimation
http://en.wikipedia.org/wiki/Cross-validation_(statistics)
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
See Also
kernels, kfun, density, bw.nrd0 and dkde in ks package. fgpd and gpd.
Other kden: bckden, fbckden, fgkg, fkdengpdcon, fkdengpd, fkden, kdengpdcon, kdengpd, kden
Other kdengpdcon: bckdengpdcon, fbckdengpdcon, fkdengpdcon, fkdengpd, gkgcon, kdengpdcon, kdengpd
Other gkg: fgkg, fkdengpd, gkgcon, gkg, kdengpd, kden
Other gkgcon: fgkg, fkdengpdcon, gkgcon, gkg, kdengpdcon
Other fgkgcon: gkgcon
Examples
## Not run:
set.seed(1)
par(mfrow = c(2, 1))
x = rnorm(1000)
xx = seq(-4, 4, 0.01)
y = dnorm(xx)
# Continuity constraint
fit = fgkgcon(x)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
ur, xir, phiur), col="red"))
abline(v = c(fit$ul, fit$ur), col = "red")
# No continuity constraint
fit2 = fgkg(x)
with(fit2, lines(xx, dgkg(xx, x, lambda, ul, sigmaul, xil, phiul,
ur, sigmaur, xir, phiur), col="blue"))
abline(v = c(fit2$ul, fit2$ur), col = "blue")
legend("topleft", c("True Density","No continuity constraint","With continuity constraint"),
col=c("black", "blue", "red"), lty = 1)
# Profile likelihood for initial value of threshold and fixed threshold approach
fitu = fgkgcon(x, ulseq = seq(-2, -0.2, length = 10),
urseq = seq(0.2, 2, length = 10))
fitfix = fgkgcon(x, ulseq = seq(-2, -0.2, length = 10),
urseq = seq(0.2, 2, length = 10), fixedu = TRUE)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
ur, xir, phiur), col="red"))
abline(v = c(fit$ul, fit$ur), col = "red")
with(fitu, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
ur, xir, phiur), col="purple"))
abline(v = c(fitu$ul, fitu$ur), col = "purple")
with(fitfix, lines(xx, dgkgcon(xx, x, lambda, ul, xil, phiul,
ur, xir, phiur), col="darkgreen"))
abline(v = c(fitfix$ul, fitfix$ur), col = "darkgreen")
legend("topright", c("True Density","Default initial value (90% quantile)",
"Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
col=c("black", "red", "purple", "darkgreen"), lty = 1)
## End(Not run)