CKT.hCV.l1out {CondCopulas}

R Documentation

Choose the bandwidth for kernel estimation of conditional Kendall's tau using cross-validation

Description

Let X_1 and X_2 be two random variables. The goal here is to estimate the conditional Kendall's tau (a dependence measure) between X_1 and X_2 given Z=z for a conditioning variable Z. Conditional Kendall's tau between X_1 and X_2 given Z=z is defined as:

P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) > 0 | Z_1 = Z_2 = z)

- P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) < 0 | Z_1 = Z_2 = z),

where (X_{1,1}, X_{1,2}, Z_1) and (X_{2,1}, X_{2,2}, Z_2) are two independent and identically distributed copies of (X_1, X_2, Z). For this, a kernel-based estimator is used, as described in Derumigny & Fermanian (2019). These functions aim to find the best bandwidth h among the candidate values in range_h by cross-validation. They use either:

leave-one-out cross-validation: function CKT.hCV.l1out
or K-folds cross-validation: function CKT.hCV.Kfolds
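
To make the role of the bandwidth h concrete, here is a minimal base-R sketch of a kernel-weighted Kendall's tau at a single point z. It is an illustration of the general idea only, not the CondCopulas implementation: the helper names `epa` and `weightedCKT` are hypothetical, Z is assumed univariate, and the rescaling by the off-diagonal weight mass is one possible normalization.

```r
# Illustrative sketch only; not the CondCopulas implementation.
# Epanechnikov kernel, supported on [-1, 1].
epa <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Kernel-weighted Kendall's tau at a single point z with bandwidth h
# (hypothetical helper, univariate Z assumed).
weightedCKT <- function(x1, x2, z_obs, z, h) {
  w <- epa((z_obs - z) / h)
  # At least two observations must fall inside the kernel window
  if (sum(w > 0) < 2) return(NA_real_)
  w <- w / sum(w)  # normalized weights summing to 1
  # signs[i, j] = sign((x1[i] - x1[j]) * (x2[i] - x2[j])); the diagonal is 0
  signs <- sign(outer(x1, x1, "-") * outer(x2, x2, "-"))
  # Weighted concordance minus discordance, rescaled by the total
  # weight of the pairs with i != j, which is 1 - sum(w^2)
  sum(outer(w, w) * signs) / (1 - sum(w^2))
}

# Toy check: every pair is concordant, so the result equals 1 up to rounding
z_obs <- seq(0, 1, length.out = 10)
tauUp <- weightedCKT(1:10, 1:10, z_obs, z = 0.5, h = 0.5)
# Reversing the second variable makes every pair discordant
tauDown <- weightedCKT(1:10, 10:1, z_obs, z = 0.5, h = 0.5)
```

A small h uses only observations with Z very close to z (low bias, high variance), while a large h averages over a wide window. This trade-off is what the cross-validation functions try to balance.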

Usage

CKT.hCV.l1out(
  observedX1,
  observedX2,
  observedZ,
  range_h,
  matrixSignsPairs = NULL,
  nPairs = 10 * length(observedX1),
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  progressBar = TRUE,
  verbose = FALSE
)

CKT.hCV.Kfolds(
  observedX1,
  observedX2,
  observedZ,
  ZToEstimate,
  range_h,
  matrixSignsPairs = NULL,
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  Kfolds = 5,
  progressBar = TRUE,
  verbose = FALSE
)

Arguments

`observedX1`	a vector of `n` observations of the first variable
`observedX2`	a vector of `n` observations of the second variable
`observedZ`	a vector of `n` observed values of Z. If Z is multivariate, then this is a matrix whose rows correspond to the observations of Z
`range_h`	vector containing possible values for the bandwidth.
`matrixSignsPairs`	square matrix of signs of all pairs, produced by `computeMatrixSignPairs(observedX1, observedX2)`. Only needed if `typeEstCKT` is not the default `"wdm"`.
`nPairs`	number of pairs used in the cross-validation criterion.
`typeEstCKT`	type of estimation of the conditional Kendall's tau.
`kernel.name`	name of the kernel used for smoothing. Possible choices are `"Gaussian"` (Gaussian kernel) and `"Epa"` (Epanechnikov kernel).
`progressBar`	if `TRUE`, a progress bar for each h is displayed to show the progress of the computation.
`verbose`	if `TRUE`, print the score of each h during the procedure.
`ZToEstimate`	vector of fixed conditioning values at which the difference between the two conditional Kendall's tau should be computed. If Z is multivariate, this is a matrix whose rows are the conditioning vectors at which this difference should be computed.
`Kfolds`	number of folds (subsamples) used in the K-folds cross-validation.
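
As an illustration of what `Kfolds` controls, the following base-R sketch splits `n` observation indices into folds of near-equal size. This is a generic K-fold split written for this page; the way `CKT.hCV.Kfolds` assigns observations to folds internally may differ, and the names `foldId`, `folds`, `testIdx` and `trainIdx` are hypothetical.

```r
# Illustrative sketch of a generic K-fold split; the package's internal
# fold assignment may differ.
set.seed(1)
n <- 200       # sample size (assumed for illustration)
Kfolds <- 5    # number of folds, matching the default of CKT.hCV.Kfolds

# Assign each index 1..n a fold label at random, with near-equal fold sizes
foldId <- sample(rep(seq_len(Kfolds), length.out = n))
folds <- split(seq_len(n), foldId)

# For fold k, folds[[k]] is held out and the remaining indices are used for fitting
k <- 1
testIdx <- folds[[k]]
trainIdx <- setdiff(seq_len(n), testIdx)
```

With `n = 200` and `Kfolds = 5`, each fold holds 40 observations, so each estimate on the remaining data uses 160 observations.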

Value

Both functions return a list with two components:

hCV: the chosen bandwidth
scores: vector of the same length as range_h giving the value of the CV criterion for each h tested. A lower score indicates a better fit.
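
The sketch below shows one way to inspect such a result. It uses a hand-made list with the documented structure rather than real output: the numbers in `scores` are invented for illustration, and it assumes (consistently with "lower is better", but not confirmed here) that `hCV` is the element of `range_h` with the smallest score.

```r
# Hypothetical result with the same structure as the documented return value;
# the scores are made up for illustration and are not real output.
range_h <- 3:10
resultCV <- list(
  hCV = 5,
  scores = c(0.42, 0.31, 0.25, 0.27, 0.33, 0.40, 0.48, 0.55)
)

# Candidate bandwidth with the lowest CV score
bestIdx <- which.min(resultCV$scores)
bestH <- range_h[bestIdx]

# Candidates and their scores, sorted from best to worst
scoreTable <- data.frame(h = range_h, score = resultCV$scores)
scoreTable <- scoreTable[order(scoreTable$score), ]
```

If the best score sits at the first or last element of `range_h`, it may be worth widening the candidate range and running the cross-validation again.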

References

Derumigny, A., & Fermanian, J. D. (2019). On kernel-based estimation of conditional Kendall’s tau: finite-distance bounds and asymptotic behavior. Dependence Modeling, 7(1), 292-321. Page 296, Equation (4). doi:10.1515/demo-2019-0016

Examples

# We simulate from a conditional copula
set.seed(1)
N = 200
Z = rnorm(n = N, mean = 5, sd = 2)
conditionalTau = -0.9 + 1.8 * pnorm(Z, mean = 5, sd = 2)
simCopula = VineCopula::BiCopSim(N=N , family = 1,
    par = VineCopula::BiCopTau2Par(1 , conditionalTau ))
X1 = qnorm(simCopula[,1])
X2 = qnorm(simCopula[,2])

newZ = seq(2,10,by = 0.1)
range_h = 3:10

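# Leave-one-out cross-validation, using nPairs = 100 instead of the default 10 * N
resultCV <- CKT.hCV.l1out(observedX1 = X1, observedX2 = X2,
  range_h = range_h, observedZ = Z, nPairs = 100)

# K-folds cross-validation (this overwrites the leave-one-out result)
resultCV <- CKT.hCV.Kfolds(observedX1 = X1, observedX2 = X2,
  range_h = range_h, observedZ = Z, ZToEstimate = newZ)

# CV score of each candidate bandwidth (lower is better)
plot(range_h, resultCV$scores, type = "b")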

[Package CondCopulas version 0.1.3 Index]