CKT.hCV.l1out {CondCopulas} | R Documentation |
Choose the bandwidth for kernel estimation of conditional Kendall's tau using cross-validation
Description
Let X_1 and X_2 be two random variables. The goal here is to estimate the conditional Kendall's tau (a dependence measure) between X_1 and X_2 given Z=z for a conditioning variable Z.
Conditional Kendall's tau between X_1 and X_2 given Z=z is defined as:
P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) > 0 | Z_1 = Z_2 = z)
- P( (X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) < 0 | Z_1 = Z_2 = z),
where (X_{1,1}, X_{1,2}, Z_1) and (X_{2,1}, X_{2,2}, Z_2) are two independent and identically distributed copies of (X_1, X_2, Z).
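As an illustration of this definition, the following base-R sketch computes a kernel-weighted difference between the proportions of concordant and discordant pairs. The function name `ckt_sketch`, the Epanechnikov weights, and the normalization by the off-diagonal weight mass are assumptions made for illustration; this is not the package's implementation.

```r
# Hypothetical helper (not part of CondCopulas): kernel-weighted
# concordance-minus-discordance estimate of Kendall's tau at Z = z.
ckt_sketch <- function(x1, x2, z_obs, z, h) {
  u <- (z_obs - z) / h
  w <- pmax(0.75 * (1 - u^2), 0)        # Epanechnikov kernel, zero outside |u| < 1
  ww <- outer(w, w)                     # pair weights w_i * w_j
  diag(ww) <- 0                         # exclude pairs with i == j
  if (sum(ww) == 0) return(NA_real_)    # fewer than two observations near z
  # Sign of (X_{i,1} - X_{j,1}) * (X_{i,2} - X_{j,2}) for every pair (i, j)
  s <- sign(outer(x1, x1, "-") * outer(x2, x2, "-"))
  sum(ww * s) / sum(ww)
}

# Simulated data in which the dependence changes sign with Z
set.seed(1)
n <- 300
z_obs <- runif(n)
x1 <- rnorm(n)
x2 <- ifelse(z_obs > 0.5, 1, -1) * x1 + 0.3 * rnorm(n)

tau_low  <- ckt_sketch(x1, x2, z_obs, z = 0.2, h = 0.2)  # region with negative dependence
tau_high <- ckt_sketch(x1, x2, z_obs, z = 0.8, h = 0.2)  # region with positive dependence
```

The bandwidth `h` controls how many observations around `z` receive a nonzero weight, which is why its choice matters for the quality of the estimate.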
For this, a kernel-based estimator is used, as described in Derumigny & Fermanian (2019).
These functions aim at finding the best bandwidth h among a given range_h by cross-validation. They use either:
- leave-one-out cross-validation: function CKT.hCV.l1out
- or K-folds cross-validation: function CKT.hCV.Kfolds
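The K-folds idea can be sketched in base R as follows. Everything here is an assumption made for illustration: the helper names `ckt_at` and `kfold_scores`, the Epanechnikov weights, and the criterion itself (the squared gap between the estimate from the training folds and the estimate from the held-out fold, summed over a grid of conditioning values). The package's actual criterion may differ.

```r
# Hypothetical helper (not part of CondCopulas): kernel-weighted Kendall's tau at z.
ckt_at <- function(x1, x2, z_obs, z, h) {
  w <- pmax(0.75 * (1 - ((z_obs - z) / h)^2), 0)   # Epanechnikov weights
  ww <- outer(w, w)
  diag(ww) <- 0                                    # drop pairs with i == j
  if (sum(ww) == 0) return(NA_real_)
  s <- sign(outer(x1, x1, "-") * outer(x2, x2, "-"))
  sum(ww * s) / sum(ww)
}

# Hypothetical K-fold score for each candidate bandwidth in range_h.
kfold_scores <- function(x1, x2, z_obs, z_grid, range_h, K = 5) {
  fold <- sample(rep_len(seq_len(K), length(x1)))  # random fold labels
  sapply(range_h, function(h) {
    total <- 0
    for (k in seq_len(K)) {
      test <- fold == k
      for (z in z_grid) {
        tau_train <- ckt_at(x1[!test], x2[!test], z_obs[!test], z, h)
        tau_test  <- ckt_at(x1[test],  x2[test],  z_obs[test],  z, h)
        total <- total + (tau_train - tau_test)^2
      }
    }
    total
  })
}

set.seed(1)
n <- 200
z_obs <- rnorm(n, mean = 5, sd = 2)
x1 <- rnorm(n)
x2 <- (2 * pnorm(z_obs, mean = 5, sd = 2) - 1) * x1 + 0.5 * rnorm(n)

range_h <- c(2, 4, 8)
scores <- kfold_scores(x1, x2, z_obs, z_grid = c(4, 5, 6), range_h = range_h)
hCV <- range_h[which.min(scores)]   # candidate bandwidth with the lowest score
```

A leave-one-out variant would follow the same pattern with each observation (or pair) held out in turn, at a higher computational cost.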
Usage
CKT.hCV.l1out(
  observedX1,
  observedX2,
  observedZ,
  range_h,
  matrixSignsPairs = NULL,
  nPairs = 10 * length(observedX1),
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  progressBar = TRUE,
  verbose = FALSE
)

CKT.hCV.Kfolds(
  observedX1,
  observedX2,
  observedZ,
  ZToEstimate,
  range_h,
  matrixSignsPairs = NULL,
  typeEstCKT = "wdm",
  kernel.name = "Epa",
  Kfolds = 5,
  progressBar = TRUE,
  verbose = FALSE
)
Arguments
observedX1 |
a vector of |
observedX2 |
a vector of |
observedZ |
vector of observed values of Z. If Z is multivariate, then this is a matrix whose rows correspond to the observations of Z. |
range_h |
vector containing possible values for the bandwidth. |
matrixSignsPairs |
square matrix of signs of all pairs,
produced by |
nPairs |
number of pairs used in the cross-validation criterion. |
typeEstCKT |
type of estimation of the conditional Kendall's tau. |
kernel.name |
name of the kernel used for smoothing.
Possible choices are |
progressBar |
if |
verbose |
if |
ZToEstimate |
vector of fixed conditioning values at which the difference between the two conditional Kendall's tau should be computed. Can also be a matrix whose rows are the conditioning vectors at which this difference should be computed. |
Kfolds |
number of subsamples used. |
Value
Both functions return a list with two components:
- hCV: the chosen bandwidth
- scores: vector of the same length as range_h giving the value of the CV criterion for each of the h tested. A lower score indicates a better fit.
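The returned list can be inspected as in the following sketch. The list built here is a synthetic stand-in with the documented structure, not real output, and the sketch assumes that the chosen bandwidth is the minimizer of the scores, which is consistent with lower scores indicating a better fit.

```r
# Synthetic stand-in for the returned list (illustrative values, not real output)
range_h <- 3:10
scores <- (range_h - 6)^2 / 10 + 0.5
resultCV <- list(hCV = range_h[which.min(scores)], scores = scores)

# Rank the candidate bandwidths from best (lowest score) to worst
ranking <- range_h[order(resultCV$scores)]

# Assumption: the chosen bandwidth is the minimizer of the scores
best_h <- range_h[which.min(resultCV$scores)]
```

Looking at the whole ranking, and not only at the minimizer, helps to see whether the score curve is flat around the chosen bandwidth.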
References
Derumigny, A., & Fermanian, J. D. (2019). On kernel-based estimation of conditional Kendall’s tau: finite-distance bounds and asymptotic behavior. Dependence Modeling, 7(1), 292-321. Page 296, Equation (4). doi:10.1515/demo-2019-0016
See Also
CKT.kernel for the corresponding estimator of conditional Kendall's tau by kernel smoothing.
Examples
# We simulate from a conditional copula
set.seed(1)
N = 200
Z = rnorm(n = N, mean = 5, sd = 2)
# Conditional Kendall's tau increases with Z, between -0.9 and 0.9
conditionalTau = -0.9 + 1.8 * pnorm(Z, mean = 5, sd = 2)
# Gaussian copula (family = 1) whose parameter matches conditionalTau
simCopula = VineCopula::BiCopSim(N = N, family = 1,
    par = VineCopula::BiCopTau2Par(1, conditionalTau))
X1 = qnorm(simCopula[,1])
X2 = qnorm(simCopula[,2])

newZ = seq(2, 10, by = 0.1)
range_h = 3:10

# Leave-one-out cross-validation
resultCV <- CKT.hCV.l1out(observedX1 = X1, observedX2 = X2,
    range_h = range_h, observedZ = Z, nPairs = 100)

# K-folds cross-validation (overwrites the previous result)
resultCV <- CKT.hCV.Kfolds(observedX1 = X1, observedX2 = X2,
    range_h = range_h, observedZ = Z, ZToEstimate = newZ)

# Cross-validation score of each candidate bandwidth
plot(range_h, resultCV$scores, type = "b")