fitCopula {copula}    R Documentation

Fitting Copulas to Data – Copula Parameter Estimation

Description

Parameter estimation of copulas, i.e., fitting of a copula model to multivariate (possibly “pseudo”) observations.

Usage

loglikCopula(param = getTheta(copula), u, copula,
             error = c("-Inf", "warn-Inf", "let-it-be"))

loglikCopulaMany(pList, u, copula)

## Generic [and "rotCopula" method]:
fitCopula(copula, data, ...)
## S4 method for signature 'parCopula'
fitCopula(copula, data,
          method = c("mpl", "ml", "itau", "irho", "itau.mpl"),
          posDef = is(copula, "ellipCopula"),
          start = NULL, lower = NULL, upper = NULL,
          optim.method = optimMeth(copula, method, dim = d),
          optim.control = list(maxit=1000),
          estimate.variance = NA, hideWarnings = FALSE, ...)

optimMeth(copula, method, dim)

Arguments

param

vector of free (see isFree() and getTheta()) parameter values.

pList

a list of free parameter vectors (as param above). In the one-parameter case, i.e., length(param) == 1, pList may also be a numeric vector.

u

an n x d matrix of (pseudo-)observations in [0,1]^d for computing the copula log-likelihood, where n denotes the sample size and d the dimension. Consider applying the function pobs() first to obtain such data.

data

as u, an n x d matrix of data. For method "mpl", "ml" or "itau.mpl", the data must be in [0,1]^d. For method "itau" or "irho", the data can be either in [0,1]^d or in the whole d-dimensional space.

copula

a "copula" object.

error

(for loglikCopula():) a character string specifying how errors in the underlying dCopula() calls should be handled:

"-Inf":

the value of the log likelihood should silently be set to -Inf.

"warn-Inf":

signal a warning about the error and set the value to -Inf.

"let-it-be":

the error is signalled and hence the likelihood computation fails.

method

a character string specifying the copula parameter estimator used. This can be one of:

"mpl"

Maximum pseudo-likelihood estimator (based on “pseudo-observations” in [0,1]^d, typically obtained via pobs()).

"ml"

As "mpl", but with a different variance estimator. For this to be correct (thus giving the true MLE), data are assumed to be observations from the true underlying copula whose parameter is to be estimated.

"itau"

Inversion of Kendall's tau estimator. data can be either in [0,1]^d (true or pseudo-observations of the underlying copula to be estimated) or in the d-dimensional space.

"irho"

As "itau", but with Spearman's rho instead of Kendall's tau.

"itau.mpl"

The estimator of t copula parameters suggested by Mashal and Zeevi (2002). It is based on the idea of inverting Kendall's tau to estimate the correlation matrix, introduced in a RiskLab report in 2001 and later published as Embrechts et al. (2003); see also Demarta and McNeil (2005). The data must be in [0,1]^d (either true or pseudo-observations of the underlying copula to be estimated). Note that this method requires dispstr = "un".

posDef

a logical indicating whether a proper correlation matrix is computed.

start

a vector of starting values for the parameter optimization via optim().

lower, upper

lower and upper parameter bounds for the optimization methods "Brent" or "L-BFGS-B".

optim.control

a list of control parameters passed to optim(*, control=optim.control).

optim.method

a character string specifying the optimization method, or a function which, when called with arguments (copula, method, dim), returns such a character string; see optim()'s method. Only used when method = "mpl" or "ml".

The default has been changed (for copula 0.999-16, in Aug. 2016) from "BFGS" to the result of optimMeth(copula, method, dim) which is often "L-BFGS-B".

dim

integer, the data and copula dimension, d >= 2.

estimate.variance

a logical indicating whether the estimator's asymptotic variance is computed (if available for the given copula). The default NA computes it for the methods "itau" and "irho", cannot (yet) compute it for "itau.mpl", and computes it for "mpl" or "ml" only if the optimization converged.

hideWarnings

a logical, which, if TRUE, suppresses warnings from the involved likelihood maximization (typically when the likelihood is evaluated at invalid parameter values).

...

additional arguments passed to method-specific auxiliary functions, e.g., traceOpt = TRUE (or traceOpt = 10, for every 10-th function evaluation) for tracing optimize() with method "itau.mpl", and for “manual” tracing with method "ml" or "mpl", which also shows parameter values (notably for optim.method="Brent"); see the extra arguments of the namespace-hidden function fitCopula.ml().
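The u and data arguments above expect values in [0,1]^d for the likelihood-based methods. As a minimal sketch (the simulated data and variable names are made up for illustration), pobs() turns raw observations on arbitrary scales into pseudo-observations by scaling componentwise ranks:

```r
library(copula)

set.seed(42)
n <- 50
## Hypothetical raw data with two margins on different scales
x <- cbind(rnorm(n), rexp(n))
## Pseudo-observations: componentwise ranks divided by (n + 1)
u <- pobs(x)
range(u) # strictly inside (0, 1)
```

Dividing by n + 1 rather than n keeps every value strictly below 1, which avoids evaluating copula densities on the boundary of the unit hypercube.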

Details

The only difference between "mpl" and "ml" is in the variance-covariance estimate, not in the parameter (theta) estimates.
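The statement above can be checked with a small sketch. It fits the same (simulated) pseudo-observations with both methods purely to compare the outputs; passing pseudo-observations to "ml" is done here for illustration only and is not a recommended workflow.

```r
library(copula)

set.seed(7)
## Simulated data from a Gumbel copula, converted to pseudo-observations
u <- pobs(rCopula(100, gumbelCopula(3, dim = 2)))

fit.mpl <- fitCopula(gumbelCopula(), u, method = "mpl")
fit.ml  <- fitCopula(gumbelCopula(), u, method = "ml")

## Parameter estimates should agree ...
rbind(mpl = coef(fit.mpl), ml = coef(fit.ml))
## ... whereas the standard errors come from different variance estimators
c(mpl = sqrt(vcov(fit.mpl)), ml = sqrt(vcov(fit.ml)))
```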

If method "mpl" in fitCopula() is used and if start is not assigned a value, estimates obtained from method "itau" are used as initial values in the optimization. Standard errors are computed as explained in Genest, Ghoudi and Rivest (1995); see also Kojadinovic and Yan (2010, Section 3). Their estimation requires the computation of certain partial derivatives of the (log) density. These have been implemented for six copula families thus far: the Clayton, Gumbel-Hougaard, Frank, Plackett, normal and t copula families. For other families, numerical differentiation based on grad() from package numDeriv is used (and a warning message is displayed).

In the multiparameter elliptical case and when the estimation is based on Kendall's tau or Spearman's rho, the estimated correlation matrix may not always be positive-definite. In that case, nearPD(*, corr=TRUE) (from Matrix) is applied to get a proper correlation matrix.
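The repair step mentioned above can be illustrated in isolation. The following sketch does not reproduce fitCopula() internals; it only applies nearPD(*, corr=TRUE) from Matrix to a made-up matrix R that has unit diagonal but a negative eigenvalue:

```r
library(Matrix)

## A made-up "correlation-like" matrix that is not positive definite
R <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
eigen(R, symmetric = TRUE)$values # one eigenvalue is negative

## Nearest proper correlation matrix (unit diagonal is enforced by corr = TRUE)
R.pd <- as.matrix(nearPD(R, corr = TRUE)$mat)
eigen(R.pd, symmetric = TRUE)$values
```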

For normal and t copulas, fitCopula(, method = "mpl") and fitCopula(, method = "ml") maximize the log-likelihood based on mvtnorm's dmvnorm() and dmvt(), respectively. The latter two functions set the respective densities to zero if the correlation matrices of the corresponding distributions are not positive definite. As such, the estimated correlation matrices will be positive definite.

If method "itau" or "irho" is used in fitCopula(), an estimate of the asymptotic variance (if available for the copula under consideration) will be correctly computed only if the argument data consists of pseudo-observations (see pobs()).

Consider the t copula with df.fixed=FALSE (see ellipCopula()). In this case, the methods "itau" and "irho" cannot be used in fitCopula() as they cannot estimate the degrees of freedom parameter df. For the methods "mpl" and "itau.mpl" the asymptotic variance cannot be (fully) estimated (yet). For the methods "ml" and "mpl", when start is not specified, the starting value for df is set to copula@df, typically 4.
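As a sketch of the t copula case with free degrees of freedom, the example below uses method "itau.mpl", which (as described under Arguments) inverts Kendall's tau for the correlation matrix and then estimates df. The sample size and true parameter values are arbitrary choices for illustration.

```r
library(copula)

set.seed(101)
## True model: t copula with unstructured correlation and df = 5
tc.true <- tCopula(c(0.2, 0.4, 0.6), dim = 3, dispstr = "un", df = 5)
u <- pobs(rCopula(200, tc.true))

## df.fixed = FALSE is the default, so df is estimated as well;
## no asymptotic variance is available for this method
fit.itmpl <- fitCopula(tCopula(dim = 3, dispstr = "un"), u, method = "itau.mpl")
coef(fit.itmpl) # three correlation parameters followed by df
```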

To implement the Inference Functions for Margins (IFM) method (see, e.g., Joe 2005), set method="ml" and note that data need to be parametric pseudo-observations obtained from fitted parametric marginal distribution functions. The returned large-sample variance will then underestimate the true variance (as the procedure cannot take into account the (unknown) estimation error for the margins).
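A minimal IFM sketch along the lines described above follows. The marginal families (normal and exponential), their closed-form maximum likelihood estimates, and all variable names are assumptions chosen for illustration; in practice the margins would be whatever parametric models suit the data.

```r
library(copula)

set.seed(11)
n <- 200
## Simulate data with Clayton dependence and normal / exponential margins
u0 <- rCopula(n, claytonCopula(2, dim = 2))
x <- cbind(qnorm(u0[, 1], mean = 1, sd = 2), qexp(u0[, 2], rate = 3))

## Step 1: fit each margin by maximum likelihood (closed form here)
mu    <- mean(x[, 1])
sigma <- sqrt(mean((x[, 1] - mu)^2))
rate  <- 1 / mean(x[, 2])

## Step 2: parametric pseudo-observations from the fitted marginal dfs
u.par <- cbind(pnorm(x[, 1], mean = mu, sd = sigma),
               pexp(x[, 2], rate = rate))

## Step 3: fit the copula by "ml" on the parametric pseudo-observations
fit.ifm <- fitCopula(claytonCopula(dim = 2), u.par, method = "ml")
coef(fit.ifm)
```

The standard error reported for fit.ifm ignores the uncertainty from Step 1, which is why it tends to be too small.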

The fitting procedures based on optim() generate warnings because invalid parameter values are tried during the optimization process. When the number of parameters is one and the parameter space is bounded, using optim.method="Brent" is likely to give fewer warnings. Furthermore, from experience, optim.method="Nelder-Mead" is sometimes a more robust alternative to optim.method="BFGS" or "L-BFGS-B".
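The following sketch compares the default optimizer with optim.method="Brent" for a one-parameter family. The search interval [1, 20] is an assumption made for this example (Gumbel parameters are at least 1; the upper end is arbitrary), since "Brent" needs finite bounds.

```r
library(copula)

set.seed(7)
u <- pobs(rCopula(100, gumbelCopula(3, dim = 2)))

## Default optimizer, with optimization warnings suppressed
fit.def <- fitCopula(gumbelCopula(), u, method = "mpl", hideWarnings = TRUE)

## One-dimensional "Brent" search on an explicitly chosen finite interval
fit.brent <- fitCopula(gumbelCopula(), u, method = "mpl",
                       optim.method = "Brent", lower = 1, upper = 20,
                       hideWarnings = TRUE)

rbind(default = coef(fit.def), Brent = coef(fit.brent))
```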

There are methods for vcov(), coef(), logLik(), and nobs().
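A short sketch of these extractor methods on a fitted object (the Clayton family and the sample size are arbitrary choices for illustration):

```r
library(copula)

set.seed(5)
n <- 150
u <- pobs(rCopula(n, claytonCopula(2, dim = 2)))
fit <- fitCopula(claytonCopula(dim = 2), u, method = "mpl")

coef(fit)   # named vector of parameter estimates
vcov(fit)   # estimated variance-covariance matrix of the estimator
logLik(fit) # maximized log-likelihood, an object of class "logLik"
nobs(fit)   # number of observations used in the fit
```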

Value

loglikCopula() returns the copula log-likelihood evaluated at the parameter (vector) param given the data u.

loglikCopulaMany() returns a numeric vector of such log-likelihoods; it assumes consistent parameter values, corresponding to loglikCopula()'s error = "let-it-be", for speed.
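To make the two return values concrete, here is a small sketch. It compares loglikCopula() with a direct sum of log-densities and evaluates a made-up grid of parameter values with loglikCopulaMany(); the grid and the Clayton family are illustrative choices.

```r
library(copula)

set.seed(1)
cop <- claytonCopula(2, dim = 2)
u <- rCopula(100, cop)

## Log-likelihood at theta = 2, and the same quantity computed by hand
ll <- loglikCopula(2, u = u, copula = cop)
ll.manual <- sum(dCopula(u, cop, log = TRUE))

## Several valid parameter values at once (numeric vector in the 1-parameter case)
thetas <- c(0.5, 1, 2, 4)
ll.grid <- loglikCopulaMany(thetas, u = u, copula = cop)
cbind(theta = thetas, loglik = ll.grid)
```

Because loglikCopulaMany() skips the error handling, the grid should contain only parameter values that are valid for the copula family.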

The return value of fitCopula() is an object of class "fitCopula" (inheriting from hidden class "fittedMV"), containing (among others!) the slots

estimate

The parameter estimates.

var.est

The large-sample (i.e., asymptotic) variance estimate of the parameter estimator, unless estimate.variance=FALSE, in which case it is matrix(numeric(), 0,0) (to be distinguishable from cases when the covariance estimates failed partially).

copula

The fitted copula object.
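The slots listed above can be inspected directly with @. A minimal sketch, using a Frank copula and estimate.variance = FALSE to show the empty var.est case (the data are simulated for illustration):

```r
library(copula)

set.seed(3)
u <- pobs(rCopula(100, frankCopula(5, dim = 2)))
fit <- fitCopula(frankCopula(), u, method = "mpl", estimate.variance = FALSE)

fit@estimate     # parameter estimate(s)
dim(fit@var.est) # documented to be a 0 x 0 matrix when estimate.variance = FALSE
fit@copula       # the fitted copula object, with the estimate plugged in
```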

The summary() method for "fitCopula" objects returns an S3 “class” "summary.fitCopula", which is simply a list with components method, loglik and convergence, all three from the corresponding slots of the "fitCopula" objects, and coefficients (a matrix of estimated coefficients, standard errors, t values and p-values).
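Since the summary is a plain list, its components can be extracted with $. A brief sketch on simulated data (the Gumbel family is an arbitrary choice):

```r
library(copula)

set.seed(7)
u <- pobs(rCopula(100, gumbelCopula(3, dim = 2)))
s <- summary(fitCopula(gumbelCopula(), u, method = "mpl"))

class(s)       # "summary.fitCopula"
s$method       # estimation method used
s$loglik       # maximized log-likelihood
s$convergence  # convergence code from the optimizer
s$coefficients # matrix of estimates with standard errors
```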

References

Genest, C. (1987). Frank's family of bivariate distributions. Biometrika 74, 549–555.

Genest, C. and Rivest, L.-P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association 88, 1034–1043.

Rousseeuw, P. and Molenberghs, G. (1993). Transformation of nonpositive semidefinite correlation matrices. Communications in Statistics: Theory and Methods 22, 965–984.

Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82, 543–552.

Joe, H. (2005). Asymptotic efficiency of the two-stage estimation method for copula-based models. Journal of Multivariate Analysis 94, 401–419.

Mashal, R. and Zeevi, A. (2002). Beyond Correlation: Extreme Co-movements Between Financial Assets. https://www0.gsb.columbia.edu/faculty/azeevi/PAPERS/BeyondCorrelation.pdf (2016-04-05)

Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International Statistical Review 73, 111–129.

Genest, C. and Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347–368.

Kojadinovic, I. and Yan, J. (2010). Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insurance: Mathematics and Economics 47, 52–63.

See Also

Copula, fitMvdc for fitting multivariate distributions including the margins, gofCopula for goodness-of-fit tests.

For maximum likelihood of (nested) Archimedean copulas, see emle, etc.

Examples

(Xtras <- copula:::doExtras()) # determine whether examples will be extra (long)
n <- if(Xtras) 200 else 64 # sample size

## A Gumbel copula
set.seed(7) # for reproducibility
gumbel.cop <- gumbelCopula(3, dim=2)
x <- rCopula(n, gumbel.cop) # "true" observations (simulated)
u <- pobs(x)                # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(gumbelCopula(), u, method="itau")
fit.tau
confint(fit.tau) # works fine!
confint(fit.tau, level = 0.98)
summary(fit.tau) # a bit more, notably "Std. Error"s
coef(fit.tau) # named vector
coef(fit.tau, SE = TRUE) # matrix

## Inverting Spearman's rho
fit.rho <- fitCopula(gumbelCopula(), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(gumbelCopula(), u, method="mpl")
fit.mpl
## Maximum likelihood -- use 'x', not 'u' ! --
fit.ml <- fitCopula(gumbelCopula(), x, method="ml")
summary(fit.ml) # now prints a bit more than simple 'fit.ml'
## ... and what's the log likelihood (in two different ways):
(ll. <- logLik(fit.ml))
stopifnot(all.equal(as.numeric(ll.),
            loglikCopula(coef(fit.ml), u=x, copula=gumbel.cop)))

## A Gauss/normal copula

## With multiple/*un*constrained parameters
set.seed(6) # for reproducibility
normal.cop <- normalCopula(c(0.6, 0.36, 0.6), dim=3, dispstr="un")
x <- rCopula(n, normal.cop) # "true" observations (simulated)
u <- pobs(x)                # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="irho")
fit.rho
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="mpl")
summary(fit.mpl)
coef(fit.mpl) # named vector
coef(fit.mpl, SE = TRUE) # the matrix, with SE
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="un"), x, method="ml", traceOpt=TRUE)
summary(fit.ml)
confint(fit.ml)
confint(fit.ml, level = 0.999) # clearly non-0

## Fix some of the parameters
param <- c(.6, .3, NA_real_)
attr(param, "fixed") <- c(TRUE, FALSE, FALSE)
ncp <- normalCopula(param = param, dim = 3, dispstr = "un")
fixedParam(ncp) <- c(TRUE, TRUE, FALSE)
## 'traceOpt = 5': showing every 5-th log likelihood evaluation:
summary(Fxf.mpl <- fitCopula(ncp, u, method = "mpl", traceOpt = 5))
Fxf.mpl@copula # reminding of the fixed param. values

## With dispstr = "toep" :
normal.cop.toep <- normalCopula(c(0, 0), dim=3, dispstr="toep")
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="mpl")
fit.mpl
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="toep"), x, method="ml")
summary(fit.ml)

## With dispstr = "ar1"
normal.cop.ar1 <- normalCopula(c(0), dim=3, dispstr="ar1")
## Inverting Kendall's tau
summary(fit.tau <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="itau"))
## Inverting Spearman's rho
summary(fit.rho <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="irho"))
## Maximum pseudo-likelihood
summary(fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="mpl"))
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="ar1"), x, method="ml")
summary(fit.ml)

## A t copula with variable df (df.fixed=FALSE)
(tCop <- tCopula(c(0.2,0.4,0.6), dim=3, dispstr="un", df=5))
set.seed(101)
x <- rCopula(n, tCop) # "true" observations (simulated)
## Maximum likelihood (start = (rho[1:3], df))
summary(tc.ml <- fitCopula(tCopula(dim=3, dispstr="un"), x, method="ml",
                           start = c(0,0,0, 10)))
## Maximum pseudo-likelihood (the asymptotic variance cannot be estimated)
u <- pobs(x)          # pseudo-observations
tc.mpl <- fitCopula(tCopula(dim=3, dispstr="un"),
                     u, method="mpl", estimate.variance=FALSE,
                     start= c(0,0,0, 10))
summary(tc.mpl)

[Package copula version 1.1-3 Index]