kdengpdcon {evmix} | R Documentation |
Kernel Density Estimate and GPD Tail Extreme Value Mixture Model With Single Continuity Constraint
Description
Density, cumulative distribution function, quantile function and
random number generation for the extreme value mixture model with a kernel density
estimate for the bulk distribution up to the threshold and a conditional GPD above
the threshold, with continuity at the threshold. The parameters are the bandwidth
lambda, threshold u, GPD shape xi and tail fraction phiu.
Usage
dkdengpdcon(x, kerncentres, lambda = NULL,
u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
bw = NULL, kernel = "gaussian", log = FALSE)
pkdengpdcon(q, kerncentres, lambda = NULL,
u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
bw = NULL, kernel = "gaussian", lower.tail = TRUE)
qkdengpdcon(p, kerncentres, lambda = NULL,
u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
bw = NULL, kernel = "gaussian", lower.tail = TRUE)
rkdengpdcon(n = 1, kerncentres, lambda = NULL,
u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
bw = NULL, kernel = "gaussian")
Arguments
x | quantiles
kerncentres | kernel centres (typically sample data vector or scalar)
lambda | bandwidth for kernel (as half-width of kernel) or NULL
u | threshold
xi | shape parameter
phiu | probability of being above threshold, or TRUE to use the tail fraction from the KDE bulk model
bw | bandwidth for kernel (as standard deviations of kernel) or NULL
kernel | kernel name (default = "gaussian")
log | logical, if TRUE then log density
q | quantiles
lower.tail | logical, if FALSE then upper tail probabilities
p | cumulative probabilities
n | sample size (positive integer)
Details
Extreme value mixture model combining kernel density estimate (KDE) for the bulk below the threshold and GPD for upper tail with continuity at threshold.
The user can pre-specify phiu, permitting a parameterised value for the tail fraction \phi_u. Alternatively, when phiu=TRUE the tail fraction is estimated as the tail fraction from the KDE bulk model.
The alternative bandwidth definitions are discussed in the kernels help, with lambda as the default. The bw specification is the same as that used in the density function.
The possible kernels are also defined in the kernels help, with "gaussian" as the default choice.
The cumulative distribution function with tail fraction \phi_u defined by the upper tail fraction of the kernel density estimate (phiu=TRUE), up to the threshold x \le u, is given by:
F(x) = H(x)
and above the threshold x > u:
F(x) = H(u) + [1 - H(u)] G(x)
where H(x) and G(x) are the KDE and conditional GPD cumulative distribution functions respectively.
The cumulative distribution function for pre-specified \phi_u, up to the threshold x \le u, is given by:
F(x) = (1 - \phi_u) H(x)/H(u)
and above the threshold x > u:
F(x) = (1 - \phi_u) + \phi_u G(x)
Notice that these definitions are equivalent when \phi_u = 1 - H(u).
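The equivalence of the two definitions can be checked numerically. The following is a minimal base R sketch, not the evmix implementation: it assumes a Gaussian kernel, and the names centres, H, h, G, Fbulk and Fphi are hypothetical helpers introduced only for illustration. The above-threshold branch of Fphi is written in the form that reduces to H(u) + [1 - H(u)] G(x) when \phi_u = 1 - H(u).

```r
# Base R sketch (assumption: Gaussian kernel; not the evmix code itself)
set.seed(1)
centres <- rnorm(500)          # hypothetical kernel centres
lambda  <- bw.nrd0(centres)    # bandwidth from the normal reference rule
u  <- 1.2                      # threshold
xi <- 0.1                      # GPD shape

# KDE cumulative distribution function H(x) and density h(x)
H <- function(x) sapply(x, function(v) mean(pnorm(v, centres, lambda)))
h <- function(x) sapply(x, function(v) mean(dnorm(v, centres, lambda)))

# Conditional GPD cumulative distribution function G(x) above u
G <- function(x, sigmau) {
  z <- pmax(x - u, 0) / sigmau
  if (abs(xi) < 1e-8) 1 - exp(-z) else 1 - pmax(1 + xi * z, 0)^(-1 / xi)
}

# Tail fraction taken from the bulk model (phiu = TRUE case)
Fbulk <- function(x, sigmau) {
  ifelse(x <= u, H(x), H(u) + (1 - H(u)) * G(x, sigmau))
}

# Pre-specified tail fraction phiu
Fphi <- function(x, phiu, sigmau) {
  ifelse(x <= u, (1 - phiu) * H(x) / H(u), (1 - phiu) + phiu * G(x, sigmau))
}

phiu   <- 1 - H(u)             # bulk-based tail fraction
sigmau <- (1 - H(u)) / h(u)    # scale implied by continuity in this case
xx     <- seq(-4, 6, by = 0.5)

# The two definitions agree when phiu = 1 - H(u)
all.equal(Fbulk(xx, sigmau), Fphi(xx, phiu, sigmau))
```

With any other value of phiu the bulk is simply rescaled so that F(u) = 1 - \phi_u, which is why the two forms differ only through the tail fraction.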
The continuity constraint means that (1 - \phi_u) h(u)/H(u) = \phi_u g(u), where h(x) and g(x) are the KDE and conditional GPD density functions respectively. Since g(u) = 1/\sigma_u, the resulting GPD scale parameter is:
\sigma_u = \phi_u H(u) / ([1 - \phi_u] h(u))
In the special case where the tail fraction is defined by the bulk model this reduces to:
\sigma_u = [1 - H(u)] / h(u)
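The scale implied by the continuity constraint can be checked numerically. This is a base R sketch under stated assumptions (Gaussian kernel, positive shape xi); the helper names H, h and dmix are hypothetical and the code is not taken from evmix. It computes \sigma_u for a pre-specified tail fraction and compares the mixture density just below and just above the threshold.

```r
# Base R sketch (assumption: Gaussian kernel, xi > 0; not the evmix code)
set.seed(1)
centres <- rnorm(500)          # hypothetical kernel centres
lambda  <- bw.nrd0(centres)    # bandwidth from the normal reference rule
u    <- 1                      # threshold
xi   <- 0.2                    # GPD shape
phiu <- 0.2                    # pre-specified tail fraction

# KDE cumulative distribution function H(x) and density h(x)
H <- function(x) sapply(x, function(v) mean(pnorm(v, centres, lambda)))
h <- function(x) sapply(x, function(v) mean(dnorm(v, centres, lambda)))

# GPD scale from the continuity constraint
sigmau <- phiu * H(u) / ((1 - phiu) * h(u))

# Mixture density: rescaled KDE below u, scaled conditional GPD above u
dmix <- function(x) {
  z <- pmax(x - u, 0) / sigmau
  g <- (1 + xi * z)^(-1 / xi - 1) / sigmau
  ifelse(x <= u, (1 - phiu) * h(x) / H(u), phiu * g)
}

eps <- 1e-8
c(below = dmix(u - eps), above = dmix(u + eps))

# Special case: tail fraction defined by the bulk model
phib <- 1 - H(u)
all.equal(phib * H(u) / ((1 - phib) * h(u)), (1 - H(u)) / h(u))
```

The two density values agree up to the step size eps, so the density has no jump at u even though its slope generally changes there.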
If no bandwidth is provided (lambda=NULL and bw=NULL) then the normal reference rule is used, via the bw.nrd0 function, which is consistent with the density function. At least two kernel centres must be provided as the variance needs to be estimated.
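The normal reference rule mentioned above is available in base R, so its behaviour can be inspected directly. The sketch below uses only bw.nrd0 and density from the stats package; the vector centres is a hypothetical sample. How the resulting value maps onto lambda for non-Gaussian kernels is covered in the kernels help and is not reproduced here.

```r
# Normal reference rule as implemented by stats::bw.nrd0
set.seed(1)
centres <- rnorm(500)          # hypothetical kernel centres

bw_default <- bw.nrd0(centres)

# density() uses the same rule when no bandwidth is supplied
bw_density <- density(centres)$bw

# The rule written out: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)
bw_manual <- 0.9 * min(sd(centres), IQR(centres) / 1.34) *
  length(centres)^(-1 / 5)

c(bw.nrd0 = bw_default, density = bw_density, manual = bw_manual)

# A single kernel centre cannot be used: the spread cannot be estimated
single <- try(bw.nrd0(1.5), silent = TRUE)
inherits(single, "try-error")
```

Supplying lambda or bw explicitly avoids this requirement, since no spread needs to be estimated from the kernel centres.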
See gpd for details of the GPD upper tail component and dkden for details of the KDE bulk component.
Value
dkdengpdcon gives the density, pkdengpdcon gives the cumulative distribution function, qkdengpdcon gives the quantile function and rkdengpdcon gives a random sample.
Acknowledgments
Based on code by Anna MacDonald produced for MATLAB.
Note
Unlike most of the other extreme value mixture model functions, the kdengpdcon functions have not been vectorised as this is not appropriate. The main inputs (x, p or q) must be either a scalar or a vector, which also defines the output length.
The kernel centres kerncentres can be either a single datapoint or a vector of data. The kernel centres (kerncentres) and the locations at which to evaluate the density (x) and cumulative distribution function (q) would usually be different.
Default values are provided for all inputs, except for the fundamentals kerncentres, x, q and p. The default sample size for rkdengpdcon is 1.
Missing (NA) and Not-a-Number (NaN) values in x, p and q are passed through as is and infinite values are set to NA. None of these are permitted for the parameters or kernel centres.
Due to symmetry, the lower tail can be described by GPD by negating the quantiles.
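The negation idea can be illustrated with a small base R sketch. It is not the evmix implementation: it assumes a Gaussian kernel and a tail fraction taken from the bulk model, and dupper and dlower are hypothetical helpers. A lower tail model with threshold ul is obtained by evaluating the upper tail model at the negated quantiles, with the kernel centres and threshold negated as well.

```r
# Base R sketch (assumption: Gaussian kernel, bulk-based tail fraction)
set.seed(1)
centres <- rnorm(500)          # hypothetical kernel centres
lambda  <- bw.nrd0(centres)    # unchanged when the centres are negated
xi <- 0.1                      # GPD shape
ul <- -1.2                     # lower threshold

# Upper tail mixture density: KDE up to u, scaled conditional GPD above u
dupper <- function(x, centres, u, xi) {
  H <- function(v) sapply(v, function(w) mean(pnorm(w, centres, lambda)))
  h <- function(v) sapply(v, function(w) mean(dnorm(w, centres, lambda)))
  phiu   <- 1 - H(u)           # tail fraction from the bulk model
  sigmau <- phiu / h(u)        # scale from the continuity constraint
  z <- pmax(x - u, 0) / sigmau
  g <- if (abs(xi) < 1e-8) {
    exp(-z) / sigmau
  } else {
    pmax(1 + xi * z, 0)^(-1 / xi - 1) / sigmau
  }
  ifelse(x <= u, h(x), phiu * g)
}

# Lower tail model: negate the quantiles, kernel centres and threshold
dlower <- function(x, centres, u, xi) dupper(-x, -centres, -u, xi)

# The result is continuous at ul and integrates to approximately one
f <- function(x) dlower(x, centres, ul, xi)
total <- integrate(f, -Inf, ul)$value + integrate(f, ul, Inf)$value
c(below = f(ul - 1e-8), above = f(ul + 1e-8), total = total)
```

The same pattern should carry over to the distribution function, where the upper tail probability at the negated quantile gives the lower tail cumulative probability.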
Error checking of the inputs (e.g. invalid probabilities) is carried out and will either stop or give a warning message as appropriate.
Author(s)
Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz.
References
http://en.wikipedia.org/wiki/Kernel_density_estimation
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
See Also
kernels, kfun, density, bw.nrd0 and dkde in ks package.
Other kden: bckden, fbckden, fgkgcon, fgkg, fkdengpdcon, fkdengpd, fkden, kdengpd, kden
Other kdengpd: bckdengpd, fbckdengpd, fgkg, fkdengpdcon, fkdengpd, fkden, gkg, kdengpd, kden
Other kdengpdcon: bckdengpdcon, fbckdengpdcon, fgkgcon, fkdengpdcon, fkdengpd, gkgcon, kdengpd
Other gkgcon: fgkgcon, fgkg, fkdengpdcon, gkgcon, gkg
Other bckdengpdcon: bckdengpdcon, bckdengpd, bckden, fbckdengpdcon, fbckdengpd, fbckden, fkdengpdcon, gkgcon
Other fkdengpdcon: fkdengpdcon
Examples
## Not run:
set.seed(1)
par(mfrow = c(2, 2))

# Kernel centres: a standard normal sample
kerncentres = rnorm(500, 0, 1)
xx = seq(-4, 4, 0.01)

# Density with threshold u = 1.2 and shape xi = 0.1, overlaid on the histogram
hist(kerncentres, breaks = 100, freq = FALSE)
lines(xx, dkdengpdcon(xx, kerncentres, u = 1.2, xi = 0.1))

# Cumulative distribution function for three shape parameters (default threshold)
plot(xx, pkdengpdcon(xx, kerncentres), type = "l")
lines(xx, pkdengpdcon(xx, kerncentres, xi = 0.3), col = "red")
lines(xx, pkdengpdcon(xx, kerncentres, xi = -0.3), col = "blue")
legend("topleft", paste("xi =", c(0, 0.3, -0.3)),
  col = c("black", "red", "blue"), lty = 1, cex = 0.5)
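
# Simulate with a pre-specified tail fraction, then overlay the density
# evaluated at the same parameters as were used for the simulation
x = rkdengpdcon(1000, kerncentres, phiu = 0.2, u = 1, xi = 0.2)
xx = seq(-4, 6, 0.01)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 6))
lines(xx, dkdengpdcon(xx, kerncentres, phiu = 0.2, u = 1, xi = 0.2))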

# Density for three shape parameters with fixed threshold and tail fraction
plot(xx, dkdengpdcon(xx, kerncentres, xi = 0, u = 1, phiu = 0.2), type = "l")
lines(xx, dkdengpdcon(xx, kerncentres, xi = 0.2, u = 1, phiu = 0.2), col = "red")
lines(xx, dkdengpdcon(xx, kerncentres, xi = -0.2, u = 1, phiu = 0.2), col = "blue")
legend("topleft", c("xi = 0", "xi = 0.2", "xi = -0.2"),
  col = c("black", "red", "blue"), lty = 1)
## End(Not run)