fitCopula {copula} | R Documentation |
Fitting Copulas to Data – Copula Parameter Estimation
Description
Parameter estimation of copulas, i.e., fitting of a copula model to multivariate (possibly “pseudo”) observations.
Usage
loglikCopula(param = getTheta(copula), u, copula,
error = c("-Inf", "warn-Inf", "let-it-be"))
loglikCopulaMany(pList, u, copula)
## Generic [and "rotCopula" method] : %- ../R/fitCopula.R
fitCopula(copula, data, ...)
## S4 method for signature 'parCopula'
fitCopula(copula, data,
method = c("mpl", "ml", "itau", "irho", "itau.mpl"),
posDef = is(copula, "ellipCopula"),
start = NULL, lower = NULL, upper = NULL,
optim.method = optimMeth(copula, method, dim = d),
optim.control = list(maxit=1000),
estimate.variance = NA, hideWarnings = FALSE, ...)
optimMeth(copula, method, dim)
Arguments
param |
vector of free (see |
pList |
a |
u |
|
data |
as |
copula |
a |
error |
(for
|
method |
a
|
posDef |
a |
start |
a |
lower , upper |
Lower or upper parameter bounds for the
optimization methods |
optim.control |
a |
optim.method |
a character string specify the optimization
method or a The default has been changed (for copula 0.999-16, in
Aug. 2016) from |
dim |
integer, the data and copula dimension, |
estimate.variance |
a |
hideWarnings |
a |
... |
additional arguments passed to |
Details
The only difference between "mpl"
and "ml"
is in the
variance-covariance estimate, not in the parameter
(\theta
) estimates.
If method "mpl"
in fitCopula()
is used and if
start
is not assigned a value, estimates obtained from method
"itau"
are used as initial values in the optimization. Standard
errors are computed as explained in Genest, Ghoudi and Rivest (1995);
see also Kojadinovic and Yan (2010, Section 3). Their estimation
requires the computation of certain partial derivatives of the (log)
density. These have been implemented for six copula families thus far:
the Clayton, Gumbel-Hougaard, Frank, Plackett, normal and t
copula
families. For other families, numerical differentiation based on
grad()
from package numDeriv
is used (and a warning message is displayed).
In the multiparameter elliptical case and when the estimation is based
on Kendall's tau or Spearman's rho, the estimated correlation matrix
may not always be positive-definite. In that case,
nearPD(*, corr=TRUE)
(from Matrix) is
applied to get a proper correlation matrix.
For normal and t
copulas, fitCopula(, method = "mpl")
and
fitCopula(, method = "ml")
maximize the log-likelihood based on
mvtnorm's dmvnorm()
and dmvt()
, respectively.
The latter two functions set the respective densities to zero if the
correlation matrices of the corresponding distributions are not
positive definite. As such, the estimated correlation matrices
will be positive definite.
If methods "itau"
or "irho"
are used in
fitCopula()
, an estimate of the asymptotic variance (if
available for the copula under consideration) will be correctly
computed only if the argument data
consists of pseudo-observations
(see pobs()
).
Consider the t
copula with df.fixed=FALSE
(see
ellipCopula()
). In this case, the methods "itau"
and "irho"
cannot be used in fitCopula()
as they cannot
estimate the degrees of freedom parameter df
. For the methods
"mpl"
and "itau.mpl"
the asymptotic variance cannot be
(fully) estimated (yet). For the methods "ml"
and "mpl"
,
when start
is not specified, the starting value for df
is set to copula@df
, typically 4.
To implement the Inference Functions for Margins (IFM) method
(see, e.g., Joe 2005), set method="ml"
and note that
data
need to be parametric pseudo-observations obtained from fitted
parametric marginal distribution functions. The returned large-sample
variance will then underestimate the true variance (as the procedure
cannot take into account the (unknown) estimation error for the margins).
The fitting procedures based on optim()
generate
warnings because invalid parameter values are tried during the
optimization process. When the number of parameters is one and the
parameter space is bounded, using optim.method="Brent"
is
likely to give less warnings. Furthermore, from experience,
optim.method="Nelder-Mead"
is sometimes a more robust
alternative to optim.method="BFGS"
or "L-BFGS-B"
.
There are methods for vcov()
, coef()
,
logLik()
, and nobs()
.
Value
loglikCopula()
returns the copula log-likelihood evaluated at the
parameter (vector) param
given the data u
.
loglikCopulaMany()
returns a numeric vector of such
log-likelihoods; it assumes consistent parameter values, corresponding to
loglikCopula()
's error = "let-it-be"
, for speed.
The return value of fitCopula()
is an object of class
"fitCopula"
(inheriting from hidden class
"fittedMV"
), containing (among others!) the slots
- estimate
The parameter estimates.
- var.est
The large-sample (i.e., asymptotic) variance estimate of the parameter estimator unless
estimate.variance=FALSE
where it ismatrix(numeric(), 0,0)
(to be distinguishable from cases when the covariance estimates failed partially).- copula
The fitted copula object.
The summary()
method for "fitCopula"
objects
returns an S3 “class” "summary.fitCopula"
, which is simply a list
with components method
, loglik
and convergence
,
all three from the corresponding slots of the
"fitCopula"
objects, and coefficients
(a matrix of estimated coefficients, standard errors, t values and p-values).
References
Genest, C. (1987). Frank's family of bivariate distributions. Biometrika 74, 549–555.
Genest, C. and Rivest, L.-P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association 88, 1034–1043.
Rousseeuw, P. and Molenberghs, G. (1993). Transformation of nonpositive semidefinite correlation matrices. Communications in Statistics: Theory and Methods 22, 965–984.
Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82, 543–552.
Joe, H. (2005). Asymptotic efficiency of the two-stage estimation method for copula-based models. Journal of Multivariate Analysis 94, 401–419.
Mashal, R. and Zeevi, A. (2002). Beyond Correlation: Extreme Co-movements Between Financial Assets. https://www0.gsb.columbia.edu/faculty/azeevi/PAPERS/BeyondCorrelation.pdf (2016-04-05)
Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International Statistical Review 73, 111–129.
Genest, C. and Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347–368.
Kojadinovic, I. and Yan, J. (2010). Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insurance: Mathematics and Economics 47, 52–63.
See Also
Copula
,
fitMvdc
for fitting multivariate distributions
including the margins,
gofCopula
for goodness-of-fit tests.
For maximum likelihood of (nested) Archimedean copulas, see
emle
, etc.
Examples
(Xtras <- copula:::doExtras()) # determine whether examples will be extra (long)
n <- if(Xtras) 200 else 64 # sample size
## A Gumbel copula
set.seed(7) # for reproducibility
gumbel.cop <- gumbelCopula(3, dim=2)
x <- rCopula(n, gumbel.cop) # "true" observations (simulated)
u <- pobs(x) # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(gumbelCopula(), u, method="itau")
fit.tau
confint(fit.tau) # work fine !
confint(fit.tau, level = 0.98)
summary(fit.tau) # a bit more, notably "Std. Error"s
coef(fit.tau)# named vector
coef(fit.tau, SE = TRUE)# matrix
## Inverting Spearman's rho
fit.rho <- fitCopula(gumbelCopula(), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(gumbelCopula(), u, method="mpl")
fit.mpl
## Maximum likelihood -- use 'x', not 'u' ! --
fit.ml <- fitCopula(gumbelCopula(), x, method="ml")
summary(fit.ml) # now prints a bit more than simple 'fit.ml'
## ... and what's the log likelihood (in two different ways):
(ll. <- logLik(fit.ml))
stopifnot(all.equal(as.numeric(ll.),
loglikCopula(coef(fit.ml), u=x, copula=gumbel.cop)))
## A Gauss/normal copula
## With multiple/*un*constrained parameters
set.seed(6) # for reproducibility
normal.cop <- normalCopula(c(0.6, 0.36, 0.6), dim=3, dispstr="un")
x <- rCopula(n, normal.cop) # "true" observations (simulated)
u <- pobs(x) # pseudo-observations
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="irho")
fit.rho
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="mpl")
summary(fit.mpl)
coef(fit.mpl) # named vector
coef(fit.mpl, SE = TRUE) # the matrix, with SE
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="un"), x, method="ml", traceOpt=TRUE)
summary(fit.ml)
confint(fit.ml)
confint(fit.ml, level = 0.999) # clearly non-0
## Fix some of the parameters
param <- c(.6, .3, NA_real_)
attr(param, "fixed") <- c(TRUE, FALSE, FALSE)
ncp <- normalCopula(param = param, dim = 3, dispstr = "un")
fixedParam(ncp) <- c(TRUE, TRUE, FALSE)
## 'traceOpt = 5': showing every 5-th log likelihood evaluation:
summary(Fxf.mpl <- fitCopula(ncp, u, method = "mpl", traceOpt = 5))
Fxf.mpl@copula # reminding of the fixed param. values
## With dispstr = "toep" :
normal.cop.toep <- normalCopula(c(0, 0), dim=3, dispstr="toep")
## Inverting Kendall's tau
fit.tau <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="itau")
fit.tau
## Inverting Spearman's rho
fit.rho <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="irho")
summary(fit.rho)
## Maximum pseudo-likelihood
fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="mpl")
fit.mpl
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="toep"), x, method="ml")
summary(fit.ml)
## With dispstr = "ar1"
normal.cop.ar1 <- normalCopula(c(0), dim=3, dispstr="ar1")
## Inverting Kendall's tau
summary(fit.tau <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="itau"))
## Inverting Spearman's rho
summary(fit.rho <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="irho"))
## Maximum pseudo-likelihood
summary(fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="mpl"))
## Maximum likelihood (use 'x', not 'u' !)
fit.ml <- fitCopula(normalCopula(dim=3, dispstr="ar1"), x, method="ml")
summary(fit.ml)
## A t copula with variable df (df.fixed=FALSE)
(tCop <- tCopula(c(0.2,0.4,0.6), dim=3, dispstr="un", df=5))
set.seed(101)
x <- rCopula(n, tCop) # "true" observations (simulated)
## Maximum likelihood (start = (rho[1:3], df))
summary(tc.ml <- fitCopula(tCopula(dim=3, dispstr="un"), x, method="ml",
start = c(0,0,0, 10)))
## Maximum pseudo-likelihood (the asymptotic variance cannot be estimated)
u <- pobs(x) # pseudo-observations
tc.mpl <- fitCopula(tCopula(dim=3, dispstr="un"),
u, method="mpl", estimate.variance=FALSE,
start= c(0,0,0, 10))
summary(tc.mpl)