CKT.KendallReg.LambdaCV {CondCopulas}

R Documentation

Kendall's regression: choice of the penalization parameter by K-folds cross-validation

Description

In this model, three variables X_1, X_2 and Z are observed. We try to model the conditional Kendall's tau between X_1 and X_2 given Z = z as follows:

\Lambda(\tau_{X_1, X_2 | Z = z}) = \sum_{i=1}^{p'} \beta_i \psi_i(z),

where \tau_{X_1, X_2 | Z = z} is the conditional Kendall's tau between X_1 and X_2 given Z = z, \Lambda is a function from ]-1, 1[ to R, (\beta_1, \dots, \beta_{p'}) are unknown coefficients to be estimated and (\psi_1, \dots, \psi_{p'}) is a dictionary of functions. To estimate \beta, we use the penalized estimator, defined as the minimizer of the following criterion, evaluated at n' points z_1, \dots, z_{n'}:

\frac{1}{2n'} \sum_{i=1}^{n'} [\Lambda(\hat\tau_{X_1, X_2 | Z = z_i}) - \sum_{j=1}^{p'} \beta_j \psi_j(z_i)]^2 + \lambda |\beta|_1.

This function chooses the penalization parameter lambda by cross-validation.
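
As a minimal sketch of the criterion above, the following base-R function evaluates it for a given coefficient vector. The names `penalizedCriterion`, `LambdaTauHat` and `designMatrix` are illustrative assumptions and are not part of CondCopulas: `LambdaTauHat` stands for the vector of transformed estimates of the conditional Kendall's tau and `designMatrix` for the n' x p' matrix of values of the \psi_j. The inputs are toy numbers chosen only to make the arithmetic easy to follow.

```r
# Illustrative helper (not part of CondCopulas): evaluates the penalized
# least-squares criterion for a candidate coefficient vector beta.
penalizedCriterion <- function(beta, LambdaTauHat, designMatrix, lambda) {
  nPrime <- length(LambdaTauHat)
  fitted <- as.vector(designMatrix %*% beta)
  # Least-squares term divided by 2n', plus the L1 penalty on beta
  sum((LambdaTauHat - fitted)^2) / (2 * nPrime) + lambda * sum(abs(beta))
}

# Toy inputs (hypothetical values)
z <- c(1, 2, 3)
designMatrix <- cbind(z, z^2)        # dictionary: psi_1(z) = z, psi_2(z) = z^2
LambdaTauHat <- c(0.1, 0.3, 0.4)

value <- penalizedCriterion(beta = c(0.2, -0.02),
                            LambdaTauHat = LambdaTauHat,
                            designMatrix = designMatrix,
                            lambda = 0.05)
print(value)
```

Larger values of `lambda` make the second term dominate, which pushes more coefficients of `beta` towards zero; this trade-off is what the cross-validation is meant to tune.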

Usage

CKT.KendallReg.LambdaCV(
  observedX1,
  observedX2,
  observedZ,
  ZToEstimate,
  designMatrixZ = cbind(ZToEstimate, ZToEstimate^2, ZToEstimate^3),
  typeEstCKT = 4,
  h_lambda,
  Lambda = identity,
  kernel.name = "Epa",
  Kfolds_lambda = 10,
  l_norm = 1,
  matrixSignsPairs = NULL,
  progressBars = "global"
)

Arguments

`observedX1`	a vector of n observations of the first variable `X_1`.
`observedX2`	a vector of n observations of the second variable `X_2`.
`observedZ`	a vector of n observations of the conditioning variable, or a matrix with n rows of observations of the conditioning vector (if `Z` is multivariate).
`ZToEstimate`	the new values of Z at which the conditional Kendall's tau should be estimated.
`designMatrixZ`	the transformation of `ZToEstimate` that will be used as predictors. By default, the first three powers of `ZToEstimate` are used, i.e. `cbind(ZToEstimate, ZToEstimate^2, ZToEstimate^3)`.
`typeEstCKT`	type of estimation of the conditional Kendall's tau.
`h_lambda`	the smoothing bandwidth used in the cross-validation procedure to choose `lambda`.
`Lambda`	the function to be applied on conditional Kendall's tau. By default, the identity function is used.
`kernel.name`	name of the kernel. Possible choices are "Gaussian" (Gaussian kernel) and "Epa" (Epanechnikov kernel).
`Kfolds_lambda`	the number of folds used in the cross-validation procedure to choose `lambda`.
`l_norm`	type of norm used for selection of the optimal lambda. l_norm=1 corresponds to the sum of absolute values of differences between predicted and estimated conditional Kendall's tau while l_norm=2 corresponds to the sum of squares of differences.
`matrixSignsPairs`	the results of a call to `computeMatrixSignPairs` (if already computed). If `NULL` (the default value), the `matrixSignsPairs` will be computed again from the data.
`progressBars`	should progress bars be displayed? Possible values are `"none"`: no progress bar at all; `"global"`: only one global progress bar (default behavior); `"eachStep"`: a global progress bar plus one progress bar for each kernel smoothing step.
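
To make the roles of `designMatrixZ`, `Lambda` and `l_norm` more concrete, here is a small base-R sketch. It does not call CondCopulas. The choice of `atanh` is only one possible function from ]-1, 1[ to R (an assumption for illustration, not a recommendation taken from the package), and `estimatedCKT` and `predictedCKT` are made-up toy values.

```r
# Points at which the conditional Kendall's tau would be estimated
ZToEstimate <- seq(2, 10, by = 2)

# Same construction as the default value of designMatrixZ: one column per
# function of the dictionary, here z, z^2 and z^3
designMatrixZ <- cbind(ZToEstimate, ZToEstimate^2, ZToEstimate^3)

# One possible Lambda (assumption): atanh maps ]-1, 1[ onto the real line
Lambda <- atanh

# Toy values (hypothetical) of estimated and predicted conditional Kendall's tau
estimatedCKT <- c(-0.5, -0.2, 0.1, 0.3, 0.6)
predictedCKT <- c(-0.4, -0.2, 0.0, 0.4, 0.5)
transformedCKT <- Lambda(estimatedCKT)

# The two discrepancies described for l_norm
errorL1 <- sum(abs(predictedCKT - estimatedCKT))  # l_norm = 1
errorL2 <- sum((predictedCKT - estimatedCKT)^2)   # l_norm = 2
```

With `l_norm = 2`, large individual differences weigh more heavily than with `l_norm = 1`, so the two choices may favor different values of lambda.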

Value

A list with the following components:

lambdaCV: the chosen value of the penalization parameter lambda.
vectorLambda: a vector containing the values of lambda that have been compared.
vectorMSEMean: the estimated MSE for each value of lambda in vectorLambda.
vectorMSESD: the estimated standard deviation of the MSE for each lambda. It can be used to construct confidence intervals for estimates of the MSE given by vectorMSEMean.
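
As a sketch of how these components could be combined, the code below builds approximate bands around the estimated MSE and locates the value of lambda with the smallest estimated MSE. The list `result` is a toy stand-in with made-up numbers, not output of the function. The bands rely on two assumptions that this page does not state: that `vectorMSESD` can be read as a standard error of `vectorMSEMean`, and that a normal approximation is acceptable.

```r
# Toy stand-in (hypothetical values) for the list returned by the function
result <- list(
  vectorLambda  = c(0.001, 0.01, 0.1, 1),
  vectorMSEMean = c(0.040, 0.030, 0.035, 0.080),
  vectorMSESD   = c(0.004, 0.003, 0.004, 0.010)
)

# Approximate 95% bands around the estimated MSE (normal approximation,
# an assumption of this sketch)
halfWidth <- qnorm(0.975) * result$vectorMSESD
lower <- result$vectorMSEMean - halfWidth
upper <- result$vectorMSEMean + halfWidth

# Value of lambda with the smallest estimated MSE among those compared
indexBest <- which.min(result$vectorMSEMean)
lambdaBest <- result$vectorLambda[indexBest]
```

When the bands of neighboring values of lambda overlap strongly, the estimated MSE does not clearly separate them, which is worth keeping in mind when interpreting `lambdaCV`.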

References

Derumigny, A., & Fermanian, J. D. (2020). On Kendall’s regression. Journal of Multivariate Analysis, 178, 104610.

Examples

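# We simulate from a conditional copula
set.seed(1)
N = 400
Z = rnorm(n = N, mean = 5, sd = 2)
# True conditional Kendall's tau, taking values between -0.9 and 0.9
conditionalTau = -0.9 + 1.8 * pnorm(Z, mean = 5, sd = 2)
# Gaussian copula (family = 1) whose parameter matches conditionalTau
simCopula = VineCopula::BiCopSim(N = N, family = 1,
    par = VineCopula::BiCopTau2Par(1, conditionalTau))
# Transform the uniform margins to standard normal margins
X1 = qnorm(simCopula[,1])
X2 = qnorm(simCopula[,2])

# Grid of values of Z at which the conditional Kendall's tau is estimated
newZ = seq(2, 10, by = 0.1)
result <- CKT.KendallReg.LambdaCV(
   observedX1 = X1, observedX2 = X2, observedZ = Z,
   ZToEstimate = newZ, h_lambda = 2)

# Estimated MSE for each candidate lambda (log scale on the x-axis)
plot(x = result$vectorLambda, y = result$vectorMSEMean,
     type = "l", log = "x")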

[Package CondCopulas version 0.1.3 Index]