R: Model (copula) selection based on 'k'-fold cross-validation

xvCopula {copula}

R Documentation

Model (copula) selection based on `k`-fold cross-validation

Description

Computes the leave-one-out cross-validation criterion (or a k-fold version of it) for the hypothesized parametric copula family using, by default, maximum pseudo-likelihood estimation.

The leave-one-out criterion is a cross-validated log-likelihood. It is denoted by \widehat{xv}_n in Grønneberg and Hjort (2014) and defined in equation (42) therein. When it is computed for several parametric copula families, it is thus meaningful to select the family that maximizes the criterion.

For k < n, where n is the sample size, the k-fold version approximates the leave-one-out criterion by using k randomly chosen, (almost) equally sized data blocks instead of n blocks of one observation each. When n is large, k-fold cross-validation is considerably faster, provided k is small compared to n.
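To make the block construction concrete, here is a minimal base-R sketch of how n row indices could be shuffled and split into k (almost) equally sized blocks. The helper name `make_blocks` is hypothetical and is not part of the copula package; the sketch illustrates the idea only and is not claimed to match how `xvCopula()` forms its blocks internally.

```r
## Sketch only: split n row indices into k shuffled, (almost) equally sized blocks.
## 'make_blocks' is a hypothetical helper, not a function of the copula package.
make_blocks <- function(n, k) {
  stopifnot(k >= 2, k <= n)
  idx   <- sample.int(n)          # shuffle the row indices
  sizes <- rep(n %/% k, k)        # base block size
  extra <- n %% k                 # rows left over after equal division
  if (extra > 0)                  # give one extra row to the first 'extra' blocks
    sizes[seq_len(extra)] <- sizes[seq_len(extra)] + 1
  split(idx, rep(seq_len(k), times = sizes))
}

set.seed(1)                       # the shuffle is random
blocks <- make_blocks(n = 203, k = 5)
lengths(blocks)                   # block sizes differ by at most one
```

With n = 203 and k = 5, three blocks hold 41 indices and two hold 40; each index appears in exactly one block.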

Usage

xvCopula(copula, x, k = NULL, verbose = interactive(),
         ties.method = eval(formals(rank)$ties.method), ...)

Arguments

`copula`	object of class `"copula"` representing the hypothesized copula family.
`x`	a data matrix that will be transformed to pseudo-observations.
`k`	the number of data blocks; if `k = NULL`, `nrow(x)` blocks are considered (which corresponds to leave-one-out cross-validation).
`verbose`	a logical indicating whether the progress of the cross-validation should be displayed via `txtProgressBar`.
`ties.method`	string specifying how ranks should be computed if there are ties in any of the coordinate samples of `x` and fitting is based on maximum pseudo-likelihood; passed to `pobs`.
`...`	additional arguments passed to `fitCopula()`.

Value

A real number equal to the cross-validation criterion multiplied by the sample size.
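As a hedged illustration of how the returned value might be used for model selection, the sketch below computes the criterion for a few candidate families and picks the largest. The choice of families, the seeds, and the variable names (`families`, `xv`) are assumptions made for this example. The seed is reset before each call so that every family is evaluated on the same shuffled blocks, assuming the shuffle is the only random step.

```r
library(copula)

set.seed(42)
x <- rCopula(200, claytonCopula(3))   # simulated sample, as in the Examples
n <- nrow(x)

## Candidate families (an arbitrary selection for illustration)
families <- list(gumbel  = gumbelCopula(),
                 frank   = frankCopula(),
                 clayton = claytonCopula(),
                 normal  = normalCopula())

## xvCopula() returns the criterion multiplied by the sample size
xv <- sapply(families, function(cop) {
  set.seed(1)                         # same shuffle for every family
  xvCopula(cop, x, k = 5, verbose = FALSE)
})

xv / n                                # criterion on the per-observation scale
names(which.max(xv))                  # family with the largest criterion
```

Dividing by n does not change the ranking of the families; it only puts the values on a per-observation scale.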

Note

k-fold cross-validation with k < n shuffles the rows of x before forming the blocks. The result thus depends on the random seed; set it (for instance with set.seed()) to obtain reproducible values.

The default estimation method is maximum pseudo-likelihood estimation, but this can be changed if necessary, along with all the other arguments of fitCopula().
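A minimal sketch of changing the estimation method follows. It assumes that the `method` argument is forwarded unchanged to `fitCopula()` through `...`, as the note above suggests, and uses `"itau"` (inversion of Kendall's tau), one of the methods `fitCopula()` accepts. The seeds and variable names are choices made for this example.

```r
library(copula)

set.seed(7)
x <- rCopula(200, claytonCopula(3))

## Default: maximum pseudo-likelihood estimation in each fold
set.seed(1)
xv_mpl <- xvCopula(claytonCopula(), x, k = 5, verbose = FALSE)

## Inversion of Kendall's tau instead; 'method' is assumed to reach fitCopula()
set.seed(1)
xv_itau <- xvCopula(claytonCopula(), x, k = 5, verbose = FALSE, method = "itau")

c(mpl = xv_mpl, itau = xv_itau)
```

Resetting the seed before each call keeps the blocks identical, so any difference between the two values comes from the estimator alone.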

References

Grønneberg, S., and Hjort, N.L. (2014) The copula information criteria. Scandinavian Journal of Statistics 41, 436–459.

Examples


library(copula)

## A two-dimensional data example ----------------------------------
x <- rCopula(200, claytonCopula(3))


## Model (copula) selection -- takes time: each call fits the copula
## 200 times, each time to 199 observations (leave-one-out).
xvCopula(gumbelCopula(), x)
xvCopula(frankCopula(), x)
xvCopula(joeCopula(), x)
xvCopula(claytonCopula(), x)
xvCopula(normalCopula(), x)
xvCopula(tCopula(), x)
xvCopula(plackettCopula(), x)


## The same with 5-fold cross-validation [to save time ...]
set.seed(1) # k-fold is random (for k < n) !
xvCopula(gumbelCopula(),  x, k=5)
xvCopula(frankCopula(),   x, k=5)
xvCopula(joeCopula(),     x, k=5)
xvCopula(claytonCopula(), x, k=5)
xvCopula(normalCopula(),  x, k=5)
xvCopula(tCopula(),       x, k=5)
xvCopula(plackettCopula(),x, k=5)

Package copula version 1.1-3
