RVineStructureSelect {VineCopula} | R Documentation |
Sequential Specification of R- and C-Vine Copula Models
Description
This function fits either an R- or a C-vine copula model to a d-dimensional
copula data set. Tree structures are determined and appropriate pair-copula
families are selected using BiCopSelect() and estimated
sequentially (forward selection of trees).
Usage
RVineStructureSelect(
data,
familyset = NA,
type = 0,
selectioncrit = "AIC",
indeptest = FALSE,
level = 0.05,
trunclevel = NA,
progress = FALSE,
weights = NA,
treecrit = "tau",
rotations = TRUE,
se = FALSE,
presel = TRUE,
method = "mle",
cores = 1
)
Arguments
data |
An N x d data matrix (with uniform margins). |
familyset |
An integer vector of pair-copula families to select from.
The vector has to include at least one
pair-copula family that allows for positive and one that allows for negative
dependence. Not listed copula families might be included to better handle
limit cases. If |
type |
Type of the vine model to be specified: |
selectioncrit |
Character indicating the criterion for pair-copula
selection. Possible choices: |
indeptest |
logical; whether a hypothesis test for the independence of
|
level |
numeric; significance level of the independence test
(default: |
trunclevel |
integer; level of truncation. |
progress |
logical; whether the tree-wise specification progress is
printed (default: |
weights |
numeric; weights for each observation (optional). |
treecrit |
edge weight for Dissmann's structure selection algorithm, see Details. |
rotations |
If |
se |
Logical; whether standard errors are estimated (default: |
presel |
Logical; whether to exclude families before fitting based on symmetry properties of the data. Makes the selection about 30% faster (on average), but may yield slightly worse results in a few special cases. |
method |
indicates the estimation method: either maximum
likelihood estimation ( |
cores |
integer; if |
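The data argument must already be on the copula scale (uniform margins). As a minimal sketch, assuming raw observations are held in a hypothetical matrix x, a rank transform is one common way to obtain approximately uniform margins. The helper name to_pseudo_obs and the simulated data are illustrative only and are not part of the package.

```r
# Hypothetical helper: rank-transform each column to the open interval (0, 1).
to_pseudo_obs <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, function(col) {
    # Dividing by (number of non-missing values + 1) keeps all values
    # strictly inside (0, 1); NAs stay NA.
    rank(col, na.last = "keep", ties.method = "average") / (sum(!is.na(col)) + 1)
  })
}

set.seed(1)
x <- cbind(a = rnorm(100), b = rexp(100), c = rt(100, df = 4))
u <- to_pseudo_obs(x)
range(u)
```

The resulting matrix u has the same dimensions as x and could then be supplied as data.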
Details
R-vine trees are selected using maximum spanning trees w.r.t. some edge
weights. The most commonly used edge weight is the absolute value of the
empirical Kendall's tau, say \hat{\tau}_{ij}. Then, the following
optimization problem is solved for each tree:
\max \sum_{\mathrm{edges}\; e_{ij} \; \mathrm{in \; spanning \; tree}} |\hat{\tau}_{ij}|,
where a spanning tree is a tree on all nodes. The setting of the first tree selection step is always a complete graph. For subsequent trees, the setting depends on the R-vine construction principles, in particular on the proximity condition.
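The optimization for the first tree can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: max_spanning_tree is a hypothetical helper that applies Prim's algorithm to the complete graph with weights |\hat{\tau}_{ij}|, and the data are simulated. It covers the first tree only; later trees would additionally have to respect the proximity condition.

```r
# Hypothetical helper: maximum spanning tree on a symmetric weight matrix
# via Prim's algorithm; returns a two-column matrix of edges (node indices).
max_spanning_tree <- function(W) {
  d <- nrow(W)
  in_tree <- c(TRUE, rep(FALSE, d - 1))   # start from node 1
  edges <- matrix(NA_integer_, nrow = d - 1, ncol = 2)
  for (k in seq_len(d - 1)) {
    # Candidate edges connect a node inside the tree to one outside it.
    cand <- W[in_tree, !in_tree, drop = FALSE]
    best <- which(cand == max(cand), arr.ind = TRUE)[1, ]
    from <- which(in_tree)[best[1]]
    to <- which(!in_tree)[best[2]]
    edges[k, ] <- c(from, to)
    in_tree[to] <- TRUE
  }
  edges
}

# Simulated data; Kendall's tau is rank-based, so no margin transform is needed.
set.seed(42)
z <- rnorm(200)
x <- cbind(z + rnorm(200, sd = 0.3), z + rnorm(200, sd = 0.6),
           rnorm(200), z + rnorm(200))
W <- abs(cor(x, method = "kendall"))   # edge weights |tau_ij|
tree <- max_spanning_tree(W)
tree            # one row per selected edge
sum(W[tree])    # value of the objective for the selected tree
```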
Some commonly used edge weights are implemented:
"tau" | absolute value of empirical Kendall's tau. |
"rho" | absolute value of empirical Spearman's rho. |
"AIC" | Akaike information criterion (multiplied by -1). |
"BIC" | Bayesian information criterion (multiplied by -1). |
"cAIC" | corrected Akaike information criterion (multiplied by -1). |
If the data contain NAs, the edge weights in "tau" and "rho" are
multiplied by the square root of the proportion of complete observations. This
penalizes pairs where fewer observations are used.
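A minimal sketch of this penalty for the "tau" weight, assuming the proportion is computed per pair of variables (this is an assumption about the details, and tau_weight_na is a hypothetical helper, not a package function):

```r
# Hypothetical helper: |Kendall's tau| on pairwise-complete observations,
# scaled by the square root of the proportion of complete pairs.
tau_weight_na <- function(u1, u2) {
  complete <- !(is.na(u1) | is.na(u2))
  tau <- cor(u1[complete], u2[complete], method = "kendall")
  abs(tau) * sqrt(mean(complete))
}

set.seed(7)
u1 <- runif(100)
u2 <- (u1 + runif(100)) / 2
u2_missing <- u2
u2_missing[1:36] <- NA          # 64 of 100 pairs remain complete

tau_weight_na(u1, u2)           # no missing values: penalty factor is 1
tau_weight_na(u1, u2_missing)   # penalty factor is sqrt(0.64) = 0.8
```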
The criteria "AIC", "BIC", and "cAIC" require estimation and
model selection for all possible pairs. This is computationally expensive and
much slower than "tau" or "rho".
The user can also specify a custom function to calculate the edge weights.
The function has to be of the form function(u1, u2, weights) ... and must
return a numeric value. The weights argument must exist, but does not have to
be used. For example, "tau" (without using weights) can be implemented
as follows:
function(u1, u2, weights)
abs(cor(u1, u2, method = "kendall", use = "complete.obs"))
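As a further illustration, a custom edge weight that does make use of weights might look like the sketch below. The function weighted_cor_weight is hypothetical, and the fallback to equal weights assumes that missing weights arrive as NA; neither is taken from the package.

```r
# Hypothetical custom edge weight: absolute weighted correlation of the
# copula data, computed with stats::cov.wt().
weighted_cor_weight <- function(u1, u2, weights) {
  ok <- !(is.na(u1) | is.na(u2))
  # Assumption: fall back to equal weights when no usable weights are supplied.
  if (length(weights) != length(u1) || anyNA(weights)) {
    weights <- rep(1, length(u1))
  }
  w <- weights[ok] / sum(weights[ok])   # normalize to sum to one
  abs(cov.wt(cbind(u1[ok], u2[ok]), wt = w, cor = TRUE)$cor[1, 2])
}

set.seed(3)
u1 <- runif(200)
u2 <- (u1 + runif(200)) / 2
weighted_cor_weight(u1, u2, weights = NA)                 # equal weights
weighted_cor_weight(u1, u2, weights = rep(c(1, 3), 100))  # unequal weights
```

Such a function would be supplied through the treecrit argument, for example treecrit = weighted_cor_weight.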
The root nodes of C-vine trees are determined similarly by identifying the node with the strongest dependencies to all other nodes. That is, we take the node with maximum column sum in the empirical Kendall's tau matrix.
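The root selection for the first C-vine tree can be sketched in base R. The simulated data and variable names are illustrative, and taking absolute values before summing is an assumption made here for consistency with the "tau" edge weight.

```r
# Simulated data in which v2 drives v1 and v3, while v4 is independent.
set.seed(11)
z <- rnorm(300)
x <- cbind(v1 = z + rnorm(300),
           v2 = z,
           v3 = z + rnorm(300),
           v4 = rnorm(300))

tau_abs <- abs(cor(x, method = "kendall"))
diag(tau_abs) <- 0                 # ignore each node's dependence on itself
col_sums <- colSums(tau_abs)
root <- names(which.max(col_sums)) # node with the strongest overall dependence
col_sums
root
```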
Note that a possible way to determine the order of the nodes in the D-vine
is to identify a shortest Hamiltonian path in terms of weights
1-|\hat{\tau}_{ij}|. This can be established for example using the package
TSP. Example code is shown below.
Value
An RVineMatrix() object with the selected structure (RVM$Matrix) and
families (RVM$family) as well as sequentially estimated parameters stored
in RVM$par and RVM$par2. The object
is augmented by the following information about the fit:
se , se2 |
standard errors for the parameter estimates; note that these are only approximate since they do not account for the sequential nature of the estimation, |
nobs |
number of observations, |
logLik , pair.logLik |
log likelihood (overall and pairwise), |
AIC , pair.AIC |
Akaike's Information Criterion (overall and pairwise), |
BIC , pair.BIC |
Bayesian Information Criterion (overall and pairwise), |
emptau |
matrix of empirical values of Kendall's tau, |
p.value.indeptest |
matrix of p-values of the independence test. |
Note
For a comprehensive summary of the vine copula model, use
summary(object); to see all its contents, use str(object).
Author(s)
Jeffrey Dissmann, Eike Brechmann, Ulf Schepsmeier, Thomas Nagler
References
Brechmann, E. C., C. Czado, and K. Aas (2012). Truncated regular vines in high dimensions with applications to financial data. Canadian Journal of Statistics 40 (1), 68-85.
Dissmann, J. F., E. C. Brechmann, C. Czado, and D. Kurowicka (2013). Selecting and estimating regular vine copulae and application to financial returns. Computational Statistics & Data Analysis, 59 (1), 52-69.
See Also
RVineMatrix(), BiCop(), RVineCopSelect(), plot.RVineMatrix(), contour.RVineMatrix()
Examples
# load data set
data(daxreturns)
# select the R-vine structure, families and parameters
# using only the first 4 variables and the first 250 observations
# we allow for the copula families: Gauss, t, Clayton, Gumbel, Frank and Joe
daxreturns <- daxreturns[1:250, 1:4]
RVM <- RVineStructureSelect(daxreturns, c(1:6), progress = TRUE)
## see the object's content or a summary
str(RVM)
summary(RVM)
## inspect the fitted model using plots
## Not run: plot(RVM) # tree structure
contour(RVM) # contour plots of all pair-copulas
## estimate a C-vine copula model with only Clayton, Gumbel and Frank copulas
CVM <- RVineStructureSelect(daxreturns, c(3,4,5), "CVine")
## determine the order of the nodes in a D-vine using the package TSP
library(TSP)
d <- dim(daxreturns)[2]
M <- 1 - abs(TauMatrix(daxreturns))
hamilton <- insert_dummy(TSP(M), label = "cut")
sol <- solve_TSP(hamilton, method = "repetitive_nn")
order <- cut_tour(sol, "cut")
DVM <- D2RVine(order, family = rep(0,d*(d-1)/2), par = rep(0, d*(d-1)/2))
RVineCopSelect(daxreturns, c(1:6), DVM$Matrix)