hoefCOP {copBasic}

R Documentation

The Hoeffding Phi of a Copula or Lp Distances (Independence, Radial Asymmetry, or Reflection Symmetry Forms)

Description

Compute the measure of association known as the Hoeffding Phi \Phi_\mathbf{C} of a copula from independence (uv = \mathbf{\Pi}; P) according to Cherubini et al. (2004, p. 164) by

\Phi_\mathbf{C} = 3 \sqrt{10\int\!\!\int_{\mathcal{I}^2} \bigl(\mathbf{C}(u,v) - uv\bigr)^2\,\mathrm{d}u\mathrm{d}v}\mbox{,}

and Nelsen (2006, p. 210) shows this as (the absolute value notation used by Nelsen helps in generalization)

\Phi_\mathbf{C} = \biggl(90\int\!\!\int_{\mathcal{I}^2} |\mathbf{C}(u,v) - uv|^2\,\mathrm{d}u\mathrm{d}v\biggr)^{1/2}\mbox{,}

for which \Phi^2_\mathbf{C} (the square of the quantity) is known as the dependence index. Gaißer et al. (2010, eq. 1) have \Phi^2_\mathbf{C} as the Hoeffding Phi-Square, and their definition, when square-rooted, matches Nelsen's listing.
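The double integral in the definition can be approximated on a regular grid. The following is a minimal base R sketch of that idea using a midpoint rule; it is not the copBasic implementation, and the names phi_brute, M_cop, P_cop, and mix_cop are hypothetical helpers introduced only for illustration.

```r
# Hypothetical helper (not part of copBasic): midpoint-rule estimate of the
# Hoeffding Phi, sqrt(90 * integral of (C(u,v) - uv)^2 over the unit square).
phi_brute <- function(cop, delta = 0.002) {
  n   <- round(1 / delta)                # number of cells per axis
  mid <- (seq_len(n) - 0.5) / n          # cell midpoints in (0, 1)
  D   <- outer(mid, mid, function(u, v) cop(u, v) - u * v)
  sqrt(90 * sum(D^2) / n^2)              # each cell has area 1/n^2
}

M_cop   <- function(u, v) pmin(u, v)     # comonotonicity copula M
P_cop   <- function(u, v) u * v          # independence copula P
# Convex combination alpha*M + (1 - alpha)*P, for which Phi equals alpha
mix_cop <- function(u, v, alpha = 0.66) alpha * pmin(u, v) + (1 - alpha) * u * v

phi_M   <- phi_brute(M_cop)    # close to 1
phi_P   <- phi_brute(P_cop)    # 0, no distance from independence
phi_mix <- phi_brute(mix_cop)  # close to alpha = 0.66
```

The convex combination case follows because \mathbf{C}(u,v) - uv = \alpha(\mathbf{M}(u,v) - uv), so the constant \alpha factors out of the integral.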

A generalization (Nelsen, 2006) to L_p distances from independence (uv = \mathbf{\Pi}; P) through the LpCOP function is

L_p \equiv \Phi_\mathbf{C}(p) = \biggl(k(p)\int\!\!\int_{\mathcal{I}^2} |\mathbf{C}(u,v) - uv|^p\,\mathrm{d}u\mathrm{d}v\biggr)^{1/p}\mbox{,}

for a p: 1 \le p \le \infty and where k(p) is a normalization constant such that \Phi_\mathbf{C}(p) = 1 when the copula \mathbf{C} is \mathbf{M} (see M) or \mathbf{W} (see W). The k(p) (bivariate definition only) for other powers is given (Nelsen, 2006, exer. 5.44, p. 213) in terms of the complete gamma function \Gamma(t) by

k(p) = \frac{\Gamma(2p+3)}{2[\Gamma(p + 1)]^2}\mbox{,}

which is implemented by the hoefCOP function. It is important to realize that the L_p distances are all symmetric nonparametric measures of dependence (Nelsen, 2006, p. 210). These are symmetric because distance from independence is used, as is evident from the “uv” in the above definitions.
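The normalization constant can be evaluated directly from its gamma function definition. The sketch below does so in base R on the log scale, which sidesteps overflow of gamma() for moderately large p; kp is a hypothetical helper name and is not a copBasic function.

```r
# k(p) = Gamma(2p + 3) / (2 * Gamma(p + 1)^2), computed through lgamma()
# so that the individual gamma terms are never formed explicitly.
# kp() is a hypothetical helper, not part of copBasic.
kp <- function(p) exp(lgamma(2 * p + 3) - log(2) - 2 * lgamma(p + 1))

k_int  <- round(kp(1:5))  # 12 90 560 3150 16632
k_frac <- kp(2.6)         # a noninteger p is also permitted
k_big  <- kp(200)         # finite, although gamma(403) alone overflows
```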

Reflection/Radial and Permutation Asymmetry—Asymmetric forms similar to the above distances exist. Joe (2014, p. 65) shows two measures of bivariate reflection asymmetry or radial asymmetry (term favored in copBasic) as the distance between \mathbf{C}(u,v) and the survival copula \hat{\mathbf{C}}(u,v) (surCOP) measured by

L_\infty^{\mathrm{radsym}} = \mathrm{sup}_{0\le u,v\le1}|\mathbf{C}(u,v) - \hat{\mathbf{C}}(u,v)|\mbox{,}

or its L_p^{\mathrm{radsym}} counterpart

L_p^{\mathrm{radsym}} = \biggl[\int\!\!\int_{\mathcal{I}^2} |\mathbf{C}(u,v) - \hat{\mathbf{C}}(u,v)|^p\,\mathrm{d}u\mathrm{d}v\biggr]^{1/p}\,\mathrm{with}\, p \ge 1\mbox{,}

where \hat{\mathbf{C}}(u,v) = u + v - 1 + \mathbf{C}(1-u, 1-v) and again p: 1 \le p \le \infty. Joe (2014) does not seem to discuss any normalization constants for these two radial asymmetry distances.
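The radial asymmetry distance can be sketched in base R by building the survival copula from the identity just stated and integrating on a midpoint grid. This is an illustration under stated assumptions rather than the LpCOPradsym implementation; surv_cop, lp_radsym, MO_cop, and P_cop are hypothetical names, and the Marshall-Olkin form is the one used in the Examples.

```r
# Hypothetical helpers (not copBasic functions).
# Survival copula: hat{C}(u,v) = u + v - 1 + C(1-u, 1-v)
surv_cop <- function(cop) function(u, v) u + v - 1 + cop(1 - u, 1 - v)

# Midpoint-rule estimate of the (unnormalized) L_p radial asymmetry distance
lp_radsym <- function(cop, p = 2, n = 400) {
  mid  <- (seq_len(n) - 0.5) / n
  hatC <- surv_cop(cop)
  D    <- outer(mid, mid, function(u, v) abs(cop(u, v) - hatC(u, v)))
  (sum(D^p) / n^2)^(1 / p)
}

# Marshall-Olkin copula with alpha = 0.8 and beta = 0.5
MO_cop <- function(u, v, alpha = 0.8, beta = 0.5)
  pmin(v * u^(1 - alpha), u * v^(1 - beta))
P_cop  <- function(u, v) u * v  # independence is radially symmetric

rad_MO <- lp_radsym(MO_cop)  # positive: radially asymmetric
rad_P  <- lp_radsym(P_cop)   # zero up to floating-point rounding
```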

Joe (2014, p. 66) offers analogous measures of bivariate permutation asymmetry (isCOP.permsym) (\mathbf{C}(u,v) \not= \mathbf{C}(v,u)) defined as

L_\infty^{\mathrm{permsym}} = \mathrm{sup}_{0\le u,v\le1}|\mathbf{C}(u,v) - \mathbf{C}(v,u)|\mbox{,}

or its L_p^{\mathrm{permsym}} counterpart

L_p^{\mathrm{permsym}} = \biggl[\int\!\!\int_{\mathcal{I}^2} |\mathbf{C}(u,v) - \mathbf{C}(v,u)|^p\,\mathrm{d}u\mathrm{d}v\biggr]^{1/p}\,\mathrm{with}\, p \ge 1\mbox{,}

where p: 1 \le p \le \infty. Again, Joe (2014) does not seem to discuss any normalization constants for these two permutation asymmetry distances. Joe (2014, p. 65) states that the “simplest one-parameter bivariate copula families [and] most of the commonly used two-parameter bivariate copula families are permutation symmetric.” The L_\infty^{\mathrm{permsym}} (or rather a similar form) is implemented by LzCOPpermsym, and a demonstration is made in that documentation.
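A grid-based sketch of the permutation asymmetry distances follows, taking the comparison to be between \mathbf{C}(u,v) and \mathbf{C}(v,u) as in the parenthetical definition of permutation asymmetry. It uses base R only; perm_dist and MO_family are hypothetical names, and this is not the LpCOPpermsym or LzCOPpermsym implementation.

```r
# Hypothetical helper (not a copBasic function): returns both the L_p and
# the sup (L_infinity) distances between C(u,v) and C(v,u) on a midpoint grid.
perm_dist <- function(cop, p = 2, n = 400) {
  mid <- (seq_len(n) - 0.5) / n
  D   <- outer(mid, mid, function(u, v) abs(cop(u, v) - cop(v, u)))
  c(Lp = (sum(D^p) / n^2)^(1 / p), Linf = max(D))
}

# Marshall-Olkin family; it is permutation symmetric only when alpha == beta
MO_family <- function(alpha, beta)
  function(u, v) pmin(v * u^(1 - alpha), u * v^(1 - beta))

asym <- perm_dist(MO_family(0.8, 0.5))  # both distances are positive
sym  <- perm_dist(MO_family(0.5, 0.5))  # both distances are zero
```

Because the unit square has unit area, the L_p value cannot exceed the sup value, which gives a quick sanity check on any numerical result.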

The asymmetrical L_\infty and L_p measures identified by Joe (2014, p. 66) are nonnegative with an upper bound that depends on p. The dependence of the bound on p is caused by the lack of a normalization constant k(p). In an earlier paragraph, Joe (2014) indicates an upper bound of 1/3, which likely concerns both L_\infty^{\mathrm{radsym}} and L_\infty^{\mathrm{permsym}}. Discussion of this 1/3, or rather the integer 3, is made within LzCOPpermsym.

The numerical integrations for L_p^{\mathrm{radsym}} and L_p^{\mathrm{permsym}} can readily return zeros. Often inspection of the formula for \mathbf{C}(u,v) itself is sufficient to judge whether symmetry exists and hence whether the distances are identically zero.

Joe (2014, p. 66) completes the asymmetry discussion with three definitions of skewness of combinations of random variables U and V: Two definitions are in uvlmoms (for U + V - 1 and U - V) and two are for V-U (nuskewCOP) and U+V-1 (nustarCOP).

Usage

hoefCOP(     cop=NULL, para=NULL, p=2, as.sample=FALSE,
                                       sample.as.prob=TRUE,
                                       brute=FALSE, delta=0.002, ...)

LpCOP(       cop=NULL, para=NULL, p=2, brute=FALSE, delta=0.002, ...)
LpCOPradsym( cop=NULL, para=NULL, p=2, brute=FALSE, delta=0.002, ...)
LpCOPpermsym(cop=NULL, para=NULL, p=2, brute=FALSE, delta=0.002, ...)

Arguments

`cop`	A copula function;
`para`	Vector of parameters or other data structure, if needed, to pass to the copula;
`p`	The value for `p` as described above with a default to 2 to match the discussion of Nelsen (2006) and the Hoeffding Phi of Cherubini et al. (2004). Do not confuse `p` with `d` described in Note;
`as.sample`	A logical controlling whether an optional R `data.frame` in `para` is used to compute the `\hat{\Phi}_\mathbf{C}` (see Note). If set to `-1`, then the message concerning CPU effort will be suppressed;
`sample.as.prob`	When `as.sample` is triggered, what are the units incoming in `para`? If they are probabilities, the default is applicable. If they are not, then the columns are re-ranked and the ranks simply scaled by `1/n`; more sophisticated empirical copula probabilities are not used (`EMPIRcop`);
`brute`	Should brute force be used instead of two nested `integrate()` functions in R to perform the double integration;
`delta`	The `\mathrm{d}u` and `\mathrm{d}v` for the brute force (`brute=TRUE`) integration; and
`...`	Additional arguments to pass.

Value

The value for \Phi_\mathbf{C}(p) is returned.

Note

Concerning the distance from independence, when p = 1, the Spearman Rho (rhoCOP) of a copula is computed, and it is seen in that documentation that k(1) = 12. The respective values of k(p) for select integers p are

p \mapsto [1, 2, 3, 4, 5] \equiv k(p) \mapsto \{12, 90, 560, 3150, 16632\}\mbox{,}

and these values are hardwired into hoefCOP and LpCOP. The integers for k(p) ensure that the equality rhoCOP(cop=PSP) == hoefCOP(cop=PSP, p=1) in the Examples is TRUE, but p can be a noninteger as well. Nelsen (2006, p. 211) reports that when p = \infty, L_\infty is

L_\infty \equiv \Phi_\mathbf{C}(\infty) = \Lambda_\mathbf{C} = 4\;\mathrm{sup}_{u,v \in \mathcal{I}}|\mathbf{C}(u,v) - uv|\mbox{.}
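The supremum form can be approximated by a maximum over a grid. The base R sketch below does this for the two bounding copulas, for which the supremum of |\mathbf{C}(u,v) - uv| is 1/4 at u = v = 1/2; linf_indep is a hypothetical helper name and is not a copBasic function.

```r
# Hypothetical helper (not part of copBasic): grid approximation of
# L_infinity = 4 * sup |C(u,v) - uv| over the unit square.
linf_indep <- function(cop, delta = 0.002) {
  n <- round(1 / delta)
  g <- (0:n) / n                      # grid including 0, 1/2, and 1
  4 * max(abs(outer(g, g, function(u, v) cop(u, v) - u * v)))
}

linf_M <- linf_indep(function(u, v) pmin(u, v))          # M: equals 1
linf_W <- linf_indep(function(u, v) pmax(u + v - 1, 0))  # W: equals 1
linf_P <- linf_indep(function(u, v) u * v)               # P: equals 0
```

A grid maximum can only underestimate a supremum, so for copulas whose extremum falls between grid nodes a smaller delta may be needed.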

A sample \hat{\Phi}_\mathbf{C} (square root of the Hoeffding Phi-Square) based on nonparametric estimation generalized for d dimensions (d = 2 for bivariate) is presented by Gaißer et al. (2010, eq. 10) for estimated probabilities \hat{U}_{ij} for the ith dimension and jth row (observation) for a sample of size n. Those authors suggest that the \hat{U}_{ij} be estimated from the empirical copula:

\hat\Phi_\mathbf{C} = \sqrt{h(d)[A + B]}\mbox{,}

where

A = \biggl(\frac{1}{n}\biggr)^2\sum_{j=1}^n\sum_{k=1}^n\prod_{i=1}^d \bigl[1 - \mathrm{max}\bigl(\hat{U}_{ij}, \hat{U}_{ik}\bigr)\bigr]\mbox{,}

B = \biggl(\frac{1}{3}\biggr)^d - \biggl(\frac{2}{n}\biggr)\biggl(\frac{1}{2}\biggr)^d \sum_{j=1}^n\prod_{i=1}^d [1 - \hat{U}_{ij}^2]\mbox{.}

The normalization constant is a function of dimension and is

h(d)^{-1} = \frac{2}{(d+1)(d+2)} - \biggl(\frac{1}{2}\biggr)^d\frac{d\,!}{\prod_{i=0}^d\bigl(i+(1/2)\bigr)}+\biggl(\frac{1}{3}\biggr)^d\mbox{.}

  set.seed(1); UV <- simCOP(n=1000, cop=PSP)
  hoefCOP(cop=PSP)                                       # 0.4547656 (theo.)
  hoefCOP(para=UV, as.sample=TRUE)                       # 0.4892757
  set.seed(1); UV <- simCOP(n=1000, cop=PSP, snv=TRUE) # std normal variates
  hoefCOP(para=UV, as.sample=TRUE, sample.as.prob=FALSE) # 0.4270324
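The sample estimator above can be transcribed almost term by term. The following base R sketch does so for a matrix of probabilities with one column per dimension; it is not the hoefCOP code, phi_hat is a hypothetical name, and the use of ranks divided by n as the estimated probabilities is an assumption made here for simplicity.

```r
# Hypothetical helper (not part of copBasic): sample Hoeffding Phi following
# the A, B, and h(d) definitions given in this Note.
phi_hat <- function(U) {
  U <- as.matrix(U); n <- nrow(U); d <- ncol(U)
  # Inverse of the normalization constant h(d); equals 1/90 when d = 2
  h_inv <- 2 / ((d + 1) * (d + 2)) -
           (1 / 2)^d * factorial(d) / prod((0:d) + 1 / 2) + (1 / 3)^d
  # A: average over all pairs (j, k) of prod_i [1 - max(U_ij, U_ik)]
  pair_prod <- matrix(1, n, n)
  for (i in seq_len(d)) {
    pair_prod <- pair_prod * (1 - outer(U[, i], U[, i], pmax))
  }
  A <- sum(pair_prod) / n^2
  # B: (1/3)^d - (2/n) (1/2)^d sum_j prod_i [1 - U_ij^2]
  B <- (1 / 3)^d - (2 / n) * (1 / 2)^d * sum(apply(1 - U^2, 1, prod))
  sqrt(max(0, (A + B) / h_inv))   # guard against tiny negative rounding
}

set.seed(1)
n <- 500
x <- rnorm(n); y <- rnorm(n)
U_ind <- cbind(rank(x), rank(y)) / n   # independent margins
U_com <- cbind(rank(x), rank(x)) / n   # perfectly comonotone margins

phi_ind <- phi_hat(U_ind)  # small, near the independence value of 0
phi_com <- phi_hat(U_com)  # near the comonotone value of 1
```

The n-by-n matrix of pairwise terms makes the memory cost grow with the square of the sample size, which is consistent with the CPU-effort message mentioned for `as.sample`.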

Author(s)

W.H. Asquith

References

Cherubini, U., Luciano, E., and Vecchiato, W., 2004, Copula methods in finance: Hoboken, NJ, Wiley, 293 p.

Gaißer, S., Ruppert, M., and Schmid, F., 2010, A multivariate version of Hoeffding's Phi-Square: Journal of Multivariate Analysis, v. 101, no. 10, pp. 2571–2586.

Joe, H., 2014, Dependence modeling with copulas: Boca Raton, CRC Press, 462 p.

Nelsen, R.B., 2006, An introduction to copulas: New York, Springer, 269 p.

Examples

## Not run: 
# Example (ii) Gaisser et al. (2010, p. 2574)
Theta <- 0.66 # Phi^2 = Theta^2 ---> Phi == Theta as shown
hoefCOP(cop=convex2COP, para=c(alpha=Theta, cop1=M, cop2=P)) # 0.6599886

rhoCOP(cop=PSP) == hoefCOP(cop=PSP, p=1) # TRUE
LpCOP(cop=PLACKETTcop, para=1.6, p=2.6)  # 0.1445137 (Fractional p)
## End(Not run)

## Not run: 
set.seed(938) # Phi(1.6; Plackett) = 0.1184489; L_1 = 0.1168737
UV <- simCOP(cop=PLACKETTcop, para=1.6, n=2000, ploton=FALSE, points=FALSE)
hoefCOP(cop=PLACKETTcop, para=1.6, p=200)  # Large p near internal limits
L_1 <- 4*max(abs(PLACKETTcop(UV$U, UV$V, para=1.6) - UV$U*UV$V)) # p = infinity
# evaluated at a finite n of simulated points, so arguably a sample-like
# statistic; next, on intuition, try a more sample-like approach
U <- runif(10000); V <- runif(10000)
L_2 <- 4*max(abs(EMPIRcop(U, V, para=UV) - U*V)) # 0.1410254 (not close enough)
## End(Not run)

## Not run: 
para <- list(alpha=0.15, beta=0.90, kappa=0.06, gamma=0.96,
             cop1=GHcop, cop2=PLACKETTcop, para1=5.5, para2=0.07)
LpCOPradsym( cop=composite2COP, para=para) # 0.02071164
LpCOPpermsym(cop=composite2COP, para=para) # 0.01540297
## End(Not run)

## Not run: 
"MOcop.formula" <- function(u,v, para=para, ...) {
   alpha <- para[1]; beta <- para[2]; return(min(v*u^(1-alpha), u*v^(1-beta)))
}
"MOcop" <- function(u,v, ...) { asCOP(u,v, f=MOcop.formula, ...) }
   LpCOPradsym( cop=MOcop, para=c(0.8, 0.5)) # 0.0261843
   LpCOPpermsym(cop=MOcop, para=c(0.8, 0.5)) # 0.0243912 
## End(Not run)

[Package copBasic version 2.2.4 Index]