kfuncCOP {copBasic} | R Documentation |
The Kendall (Distribution) Function of a Copula
Description
To begin, there are at least three terms in the literature for what appear as the same function supported by the kfuncCOP
function. The Kendall Function also is known as Kendall Distribution Function (Nelsen, 2006, p. 163) and Kendall Measure (Salvadori et al., 2007, p. 148). Each of these is dealt with in sequel to set the manner of the rather lengthy documentation for this function.
KENDALL FUNCTION—The Kendall Function () (Joe, 2014, pp. 419–422) is the cumulative distribution function (CDF) of the vector
or
(bivariate) where
is distributed as the copula:
. Letting
be the random variable for
, the Kendall Function is defined as
where is the nonexceedance probability of the joint probability
stemming from the
. Note, unlike its univariate counterpart,
is rarely uniformly distributed (Nelsen et al., 2001, p. 278). The inverse
is implemented by the
kfuncCOPinv
function, which could be used for simulation of the correct joint probability using a single unformly distributed U(0,1) random variable. A reminder is needed that
is the joint probability and
is the Kendall Function.
Joe (2014) and others as cited list various special cases of , inequalities, and some useful identities suitable for validation study:
For
(see
M
): for all
for all
dimensions;
For
(see
W
): for all
for
(bivariate only);
For
(see
P
): for
for
(bivariate only);
For any
:
for
; and
For any
:
(Nelsen, 2001, p. 281) — Z expectation, not
!
For any
:
(Nelsen, 2006, p. 163; see
tauCOP
[Examples]).
For any
:
does not uniquely determine the copula.
The last item is from Durante and Sempi (2015, p. 118), and later discussion herein will concern an example of theirs. By coincidence within a few days before receipt of the Durante and Sempi book, experiments using kfuncCOP
suggested that numerically the Galambos (GLcop
), Gumbel–Hougaard (GHcop
), and Hüsler–Reiss (HRcop
) extreme value copulas for the same Kendall Tau () all have the same
. Therefore, do all EV-copulas have the same Kendall Function? Well in fact, they do and Durante and Sempi (2015, p. 207) show that
for an EV-copula.
Joe (2014, p. 420) also indicates that strength of lower-tail dependence (taildepCOP
) affects as
, whereas strength of upper-tail dependence affects
as
. (A demonstration of tail dependence dependence is made in section Note.) Also compared to comonotonicity copula [
] there are no countermonotonicity copula (
) for dimensions greater the bivariate (Joe, 2014, p. 214)
Joe (2014) does not explicitly list an expression of that is computable directly for any
, and Nelsen (2006, p. 163) only lists a form (see later in documentation) for Archimedean copulas. Salvadori et al. (2007, eq. 3.47, p. 147) also list the Archimedean form; however, Salvadori et al. (2007, eq. 3.49, p. 148) also list a form computable directly for any
. Considerable numerical experiments and derivations involving the
copula and results for
shown later, indicate that the correct Kendall form for any
is
where for
,
can be computed by the
COPinv
function, and the partial derivative can be computed by the
derCOP
function. It is a curiosity that this form is not in Joe (2014), Nelsen et al. (2001, 2003), or Nelsen (2006), but actually in Salvadori et al. (2007).
KENDALL MEASURE—The actual expression for any by Salvadori et al. (2007, eq. 3.49, p. 148) is for Kendall Measure (
) of a copula:
where for
. Those authors report that
is the CDF of a random variable
whose distribution is
. This is clearly appears to be the same meaning as Joe (2014) and Nelsen (2006). The minus “
” in the above equation is very important.
Salvadori et al. (2007, p. 148) report that “the function represents a fundamental tool for calculating the return period of extreme events.” The complement of
is
, and the
inverse
is referred to as a secondary return period (Salvadori et al., 2007, pp. 161–170).
KENDALL DISTRIBUTION FUNCTION—Nelsen (2006, p. 163) defines the Kendall Distribution Function (say ) as
where is a generator function of an Archimedean copula and
is a one-sided derivative (Nelsen, 2006, p. 125), and
is
. This same form is listed by Salvadori et al. (2007, eq. 3.47).
Nelsen (2006) does not seem to list a more general definition for any . Because there is considerable support for Archimedean copulas in R, copBasic has deliberately been kept from being yet another Archimedean-based package. This is made for more fundamental theory and pedogogic reasons without the algorithmic efficiency relative to the many convenient properties of Archimedean copulas.
The similarity of ,
, and
, however, is obvious—research shows that there are no syntatic differences between
and
and
—they all are the CDF of the joint probability
of the copula. Consider now that Salvadori et al. show
having the form
and not a form
as previously shown for
. Which form is thus correct? The greater bulk of this documentation seeks to answer that question, and it must be concluded that Salvadori et al. (2007, eq. 3.49) definition for
has a typesetting error.
Usage
kfuncCOP(z, cop=NULL, para=NULL, wrtV=FALSE, as.sample=FALSE,
verbose=FALSE, subdivisions=100L,
rel.tol=.Machine$double.eps^0.25, abs.tol=rel.tol, ...)
kmeasCOP(z, cop=NULL, para=NULL, wrtV=FALSE, as.sample=FALSE,
verbose=FALSE, subdivisions=100L,
rel.tol=.Machine$double.eps^0.25, abs.tol=rel.tol, ...)
Arguments
z |
The values for |
cop |
A copula function; |
para |
Vector of parameters or other data structure, if needed, to pass to the copula; |
wrtV |
A logical to toggle between with respect to |
as.sample |
A control on whether an optional R |
verbose |
A logical supressing warnings from |
subdivisions |
Argument of same name passed to |
rel.tol |
Argument of same name passed to |
abs.tol |
Argument of same name passed to |
... |
Additional arguments to pass. |
Value
The value(s) for is returned.
Note
VALIDATION STUDY—A validation study using the Independence copula (
;
P
) with theoretical results of Joe (2014) and empiricism is readily performed using the expression for :
Z <- sort(c(0.01, seq(0,1, by=0.05), 0.99)) # ** Joint probabilities ** UV <- simCOP(n=4000, cop=P, graphics=FALSE); kendF <- Z - Z*log(Z) emp.kendF <- kfuncCOP(Z, para=UV, as.sample="genest") # emp. Kendall func theo.kendF <- kfuncCOP(Z, cop=P) # theo. Kendall func, numeric integration plot(Z, kendF, type="l", col=3, lwd=4, lty=2, xlim=c(0,1), ylim=c(0,1), xlab="COPULA(u,v) VALUE [JOINT PROBABILITY]", ylab="KENDALL FUNCTION, AS NONEXCEEDANCE PROBABILITY") # analytical points(Z, kendF, col=3, lwd=1, lty=2, pch=16) # green (theo. values) points(Z, emp.kendF, col=4, lwd=2, cex=1.5) # blue (empirical values) lines(Z, theo.kendF, col=2) # red (theoretical line by numerical integration) mtext("Kendall Functions: Independence Copula")
The figure so produced shows that the theoretical relation in Joe (2014) is correct because the empirical values from the simulated sample (Empirical Kendall Function; Nelsen et al., 2003) match other curves quite well. Rerunning the above code with either (
M
) or (
W
) copulas will show that the special cases listed above are consistent with the empirical estimates. The case of is degenerate at
so the empirical computation is in error for the smallest
given because of interpolation. The
copula has
along the equal value line (1:1) line.
Now consider a more comprehensive demonstration using the N4212cop
copula with some relatively weak dependence in .
Theta <- 1.17; print(rhoCOP(cop=N4212cop, para=Theta)) # Spearman Rho = 6/10 Z <- sort(c(0.01, seq(0,1, by=0.05), 0.99)) # ** Joint probabilities ** UV <- simCOP(n=16000, cop=N4212cop, para=Theta, graphics=TRUE) empir.kendF <- kfuncCOP(Z, as.sample=TRUE, para=UV, ctype="weibull") kwrtU <- kfuncCOP(Z, cop=N4212cop, para=Theta, wrtV=FALSE) kwrtV <- kfuncCOP(Z, cop=N4212cop, para=Theta, wrtV=TRUE ) plot(Z, empir.kendF, type="p", col=2, lwd=7, lty=2, xlim=c(0,1),ylim=c(0,1), xlab="COPULA(u,v) VALUE [JOINT PROBABILITY]", ylab="KENDALL FUNCTION, AS NONEXCEEDANCE PROBABILITY") abline(0,1, lty=2, lwd=0.8); mtext("Kendall Functions: N4212(1.17) Copula") lines(Z, kwrtU, col=4, lwd=4, lty=2); lines(Z, kwrtV, col=3, lwd=1, lty=2)
The figure so produced again shows congruence between the two theoretical computations and the empirical curve. Now consider another comprehensive demonstration using the PLACKETTcop
copula with some strong negative dependence in .
Theta <- 0.04 Z <- sort(c(0.01, seq(0,1, by=0.05), 0.99)) # ** Joint probabilities ** UV <- simCOP(n=2600, cop=PLACKETTcop, para=Theta, graphics=TRUE) empir.kendF <- kfuncCOP(Z, as.sample="hazen", para=UV) kwrtU <- kfuncCOP(Z, cop=PLACKETTcop, para=Theta, wrtV=FALSE) kwrtV <- kfuncCOP(Z, cop=PLACKETTcop, para=Theta, wrtV=TRUE ) plot(Z, empir.kendF, type="p", col=2, lwd=7, lty=2, xlim=c(0,1),ylim=c(0,1), xlab="COPULA(u,v) VALUE [JOINT PROBABILITY]", ylab="KENDALL FUNCTION, AS NONEXCEEDANCE PROBABILITY") abline(0,1, lty=2, lwd=0.8); mtext("Kendall Function: Plackett Copula") lines(Z, kwrtU, col=4, lwd=4, lty=2); lines(Z, kwrtV, col=3, lwd=1, lty=2)
The figure so produced again shows congruence between the two theoretical computations and the empirical curve.
Another comparison of is useful and concerns lower- and upper-tail dependency parameters (
taildepCOP
) with a comparison of three different copula all having the same Kendall Tau. The following code computes and draws the respective :
# Given a Kendall Tau of 0.4 and the GHcop, N4212, and Plackett copulas # parameters respectively are: Phi <- 1.666667; Nu <- 1.111119; Mu <- 6.60344 Z <- seq(0.005, 0.995, by=0.005) # ** Joint probabilities ** GHkenf <- kfuncCOP(Z, cop=GHcop, para=Phi, wrtV=FALSE) N4212kenf <- kfuncCOP(Z, cop=N4212cop, para=Nu, wrtV=FALSE) PLkenf <- kfuncCOP(Z, cop=PLACKETTcop, para=Mu, wrtV=FALSE) plot(qnorm(GHkenf), Z, type="l", col=1, lwd=2, xlim=c(-3,3), ylim=c(0,1), xlab="KENDALL FUNCTION, AS STANDARD NORMAL VARIATES", ylab="COPULA(u,v) VALUE, AS NONEXCEEDANCE PROBABILITY") # black curve lines(qnorm(N4212kenf), Z, col=2, lwd=2) # red curve lines(qnorm(PLkenf), Z, col=4, lwd=2) # blue curve
The red curve for the copula (
N4212cop
) is higher on the left, which shows the impact of its larger lower-tail dependency (), whereas the black curve for the
copula (
GHcop
) is similarly (about same magnitude) higher on the right, which shows the impact of its larger upper-tail dependency (). The blue curve for the
copula (
PLACKETTcop
) nearly overwrites on the left the curve, which is a reflection of both copulae having zero lower-tail dependencies (
). Finally, as anticipated by
, the curve on the right for the
is just slightly larger for the
because the
(a small difference however), and again on the right, the
curve is considerably smaller than the
because
.
Durante and Sempi (2015, p. 118) provide an example of two copula ( and
) having the same
. Let us check that out:
"C1" <- function(u,v, ...) { if (length(u) == 1) { u <- rep(u, length(v)) } else if (length(v) == 1) { v <- rep(v, length(u)) } sapply(1:length(u), function(i) { min(c(u[i], max(c(v[i]/2, u[i]+v[i]-1)))) }) } "C2" <- function(u,v, ...) { if (length(u) == 1) { u <- rep(u, length(v)) } else if (length(v) == 1) { v <- rep(v, length(u)) } sapply(1:length(u), function(i) { g <- 1/2 max(c(0, u[i]+v[i]-1, min(c(u[i], v[i]-g)), min(c(u[i]-g, v[i])))) }) } DSkf <- function(t) sapply(t, function(z) min(c(2*z, 1))) zs <- seq(0,1, by=.01); plot(zs, DSkf(zs), col=2, cex=3) # red dots (theory) lines(zs, kfuncCOP(zs, cop=C1), lwd=4, col=7) # thick yellow line lines(zs, kfuncCOP(zs, cop=C2), lwd=1, col=1) # thin black line
The plot so produced shows indeed that the numerical operations by kfuncCOP
solidly work on these two strictly singular copulas and
that are different from the two singular
and
copulas. The
curves exactly matching the theoretical curve provided by Durante and Sempi are produced.
CONVERSATIONAL ASIDE—Interestingly, Durante and Sempi (2015, pp. 115–121), like other authors of major works cited herein, do not list a general expression for the as a function of any
. Those authors develop the idea of Kendall Function well and show results but for the author (Asquith) the jump based of Theorem 3.9.2 to an expression, such as shown above for
based on
, to an usable form for any
is difficult.
This is a fascinating topic because if the Kendall Function is a reasonably important component of copula theory, then why so much difficulty in finding a canonical reference? For this one piece, whereas so much of copBasic features are quite nomenclaturely clear in say Nelsen (2006) or Joe (2014) but somehow not for the Kendall Function.
Perhaps to the professional mathematicians, the descriptions (nomenclature) used in all but Salvadori et al. (2007) are clear to intended readers. But even Salvadori et al. seemingly show theirs in error—perhaps the author (Asquith) has missed something fundamental, but the validations shown in this documentation prove at least that kfuncCOP
does what it is supposed to be doing but perhaps for the wrong reasoning. Lastly, Durante and Sempi have an error in their expression for Kendall Tau as a function of (see
tauCOP
).
SECONDARY RETURN PERIOD—As for “Kendall Measure” (after the lead of Salvadori et al. [2007, pp. 161–170]), the following code shows for purposes of discussing secondary return period that is correct and once and for all as defined in this documentation
. The secondary return period (
) is the expected interarrival time between events exceeding a
-year joint probability either from
, from
, or both. Let us use the 100-year level (primary return period;
), the Gumbel–Hougaard copula (
GHcop
) where the choice of
is made to match discussion in
copBasic-package
and Salvadori et al. (2007, table 3.3, p. 166).
# Gumbel-Hougaard [Kendall Tau=0.67267 (Salvadori et al. [2007])] Tyear <- 100; ANEP <- 1-1/Tyear; nsim <- 30000; ix <- 1:nsim 1/(1-kfuncCOP(ANEP, cop=GHcop, para=3.055)) # 148.28 years BarT <- sapply(1:20, function(i) { UV <- simCOP(n=nsim, cop=GHcop, para=3.055, graphics=FALSE) nsim/sum(GHcop(UV$U, UV$V, para=3.055) > ANEP) }) message("# BarBarT=", round(mean(BarT), digits=2), " years and StdDev(BarT)=", round( sd(BarT), digits=2)," years") # BarBarT=149.61 years and StdDev(BarT)=11.12 years
The mean of 20 repeats of a large sample simulation run for demonstrates empirical results that closely approximate theory
, and thus congruence is shown that the definition for
must be correct and that for
is incorrect. The table 3.3 in Salvadori et al. (2007) lists the secondary return period as 148.3 years, which obviously matches the output of
kfuncCOP
and empirical results shown.
Some additional details on secondary return period from Salvadori et al. (2007). Letting be some critical joint exceedance probability level,
be the complement of
, those authors (p. 166) name super-critical events having
. Thus, the secondary return period (
) must be greater than the primary return period (
), which is the case here (
) (Salvadori et al., 2007, p. 166).
Salvadori et al. (2007) provide considerable discussion of and some highlights are:
(p. 162) events equally or exceeding probability
or having return intervals
“represent [a] class of potentially dangerous events, the outliers, and [
can be used to] introduce an ad hoc return period for such destructive events.”
(p. 162) “primary return period [
] ... only takes into account the fact that a prescribed critical event is expected to appear once in a given time interval [
] ...
provides the exact probability that a potentially destructive event will happen at any given realization of
... and
gives the expected time for such an outlier to occur.”
(p. 164) “[
] only predicts that a critical event is expected to appear once in a given time interval ... would be more important to be able to calculate (1) the probability that a super-critical [sic.] event will occur at any given realization of [
], and (2) how long it takes, on average, for a super-critical event to appear.”
(p. 164) “the function
turns the difficult analysis of bivariate dynamics of
and
into a simpler one-dimensional problem.”
Author(s)
W.H. Asquith
Source
The comprehensive demonstrations are shown in the Note because of a sign convention and (or) probability convention incompatibility with the equation shown by Salvadori et al. (2007, p. 148). Initial source code development for copBasic was based on an hypothesis that the terms the “Kendall Function” and “Kendall Measure” might somehow have separate meanings—that the author must be blamed for misunderstanding the requisite nomenclature—this is evidently not true.
The as shown herein simply can not reproduce
for the
copula unless the “
” sign in the
equation is changed to a “
” to become the
definition as shown. The detective work needed for a valid function
kmeasCOP
was further complicated by fact that neither Durante and Sempi (2015), Joe (2014), Nelsen (2006), and others do not actually present a general equation for computation for any
.
Because of the subtle differences evidently between “Kendall functions” (lower case), an explict derivation for is informative to confirm what is meant by the Kendall Function as defined by
. Starting with
, then
substitution can now proceed:
which simplfies to
which matches the special case shown by Joe (2014) for the independence copula (;
P
). It is obvious thus that the “” is needed in the
definition in order to stay consistent with the basic form and integration limits shown by Salvadori et al. (2007) for
.
References
Durante, F., and Sempi, C., 2015, Principles of copula theory: Boca Raton, CRC Press, 315 p.
Genest, C., Quessy, J.F., Rémillard, B., 2006, Goodness-of-fit procedures for copula models based on the probability integral transformation: Scandinavian Journal of Statistics, v. 33, no. 2, pp. 337–366.
Joe, H., 2014, Dependence modeling with copulas: Boca Raton, CRC Press, 462 p.
Nelsen, R.B., 2006, An introduction to copulas: New York, Springer, 269 p.
Nelsen, R.B., Quesada-Molina, J.J., Rodríguez-Lallena, J.A., Úbeda-Flores, M., 2001, Distribution functions of copulas—A class of bivariate probability integral transforms: Statistics and Probability Letters, v. 54, no. 3, pp. 277–282.
Nelsen, R.B., Quesada-Molina, J.J., Rodríguez-Lallena, J.A., Úbeda-Flores, M., 2003, Kendall distribution functions: Statistics and Probability Letters, v. 65, no. 3, pp. 263–268.
Salvadori, G., De Michele, C., Kottegoda, N.T., and Rosso, R., 2007, Extremes in nature—An approach using copulas: Dordrecht, Netherlands, Springer, Water Science and Technology Library 56, 292 p.
See Also
kfuncCOPinv
, tauCOP
, derCOP
, derCOP2
, derCOPinv
, derCOPinv2
Examples
## Not run:
# Salvadori et al. (2007, p. 148, fig. 3.5 [right])
zs <- c(0.0001, seq(0.01, 1, by=0.01), 0.9999)
plot(zs, kmeasCOP(zs, cop=GHcop, para=5), log="y", type="l", lwd=4,
xlab="Z <= z", ylab="Kendall Function", xlim=c(0,1), ylim=c(0.001,1)) #
## End(Not run)