regkienerLX {FatTailsR} | R Documentation |
Regression Function for Kiener Distributions
Description
One function to estimate the parameters of Kiener distributions K1, K2, K3 and K4 and return the results in a list with many data.frames ready to use for plotting. This function performs an unweighted nonlinear regression of the logit of the empirical probabilities logit(p) on the quantiles X.
Usage
regkienerLX(X, model = "K4", pdgts = c(3, 3, 1, 1, 1, 3, 2, 4, 4, 2, 2),
            maxk = 10, mink = 0.2, app = 0, probak = pprobs2, dgts = NULL,
            exfitk = NULL)
Arguments
X |
vector of quantiles. |
model |
the model used for the regression: "K1", "K2", "K3", "K4". |
pdgts |
vector of length 11. Control the rounding of output parameters. |
maxk |
numeric. The maximum value of tail parameter k. |
mink |
numeric. The minimum value of tail parameter k. |
app |
numeric. The parameter "a" of the function ppoints. |
probak |
vector of probabilities used in output regk$fitk.
For instance |
dgts |
rounding parameter applied globally to output regk$fitk. |
exfitk |
character. A vector of parameter names to subset regk$fitk.
For instance |
Details
This function is designed to estimate the parameters of Kiener distributions
for a given dataset. It encapsulates the four distributions described in
this package.
"K1" uses model lqkiener1, "K2" uses model lqkiener2,
"K3" uses model lqkiener3 and "K4" uses model lqkiener4.
A typical input is a numeric vector that describes the returns of a stock.
Conversion from a (possible) time series format to a sorted numeric vector
is done automatically and without any check of the initial format.
There is also no check for missing or non-finite values: NA, NaN, -Inf, +Inf.
Empirical probabilities of each point in the sorted dataset are calculated
with the function ppoints. The parameter app corresponds to the parameter a
in ppoints but has been limited to the range (0, 0.5). The default value is 0
as large datasets are very common in finance.
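As a minimal base-R sketch of this step (the simulated returns and the variable names below are illustrative assumptions, not the internals of regkienerLX), the empirical probabilities and their logits can be obtained as follows:

```r
# Sorted sample standing in for stock returns (simulated for illustration only)
set.seed(42)
X <- sort(rnorm(1000, mean = 0, sd = 0.01))

# Empirical probabilities; with a = 0, ppoints returns i / (n + 1)
app <- 0
P <- ppoints(length(X), a = app)

# Logit of the probabilities, i.e. log(p / (1 - p))
L <- qlogis(P)

# First rows of the (quantile, probability, logit) triplets
head(data.frame(X = X, P = P, L = L))
```

With a = 0 the probabilities never reach 0 or 1, so the logit stays finite at both ends of the sample.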
A nonlinear regression of the logit of the probabilities logit(p) on the
quantiles X is performed with nlsLM, using one of the functions
lqkiener1234. These functions have been selected as they have an explicit
form in the four types (this is unfortunately not the case for dkiener234)
and return satisfactory results with ordinary least squares. The median is
calculated before the regression and is injected as a mandatory value in the
regression function.
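The idea of regressing logit(p) on X with the median held fixed can be sketched in base R. This is only an illustration under stated assumptions: it uses nls from the stats package instead of nlsLM, and a plain logistic model, for which logit(p) = (x - m) / g, as a stand-in for the lqkiener functions, whose formulas are not reproduced here.

```r
# Simulated sample from a logistic distribution (illustration only)
set.seed(1)
X <- sort(rlogis(500, location = 0.5, scale = 2))

# Logit of the empirical probabilities
L <- qlogis(ppoints(length(X), a = 0))

# The median is computed first and held fixed; it is not a regression parameter
m <- median(X)

# Stand-in model (assumption): logistic distribution, logit(p) = (x - m) / g
fit <- nls(L ~ (X - m) / g, start = list(g = 1))

# Estimated scale g; the data were simulated with scale = 2
coef(fit)
```

Only the scale is estimated here; in the Kiener models the tail parameters enter the same regression alongside it.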
Kiener distributions use the following parameters, some of them being redundant.
See aw2k and pk2pk for the formulas and the conversion between parameters:
- m (mu) is the median of the distribution.
- g (gamma) is the scale parameter.
- a (alpha) is the left tail parameter.
- k (kappa) is the harmonic mean of a and w and describes a global tail parameter.
- w (omega) is the right tail parameter.
- d (delta) is the distortion parameter.
- e (epsilon) is the eccentricity parameter.
Where:
c(m, g, k) of length 3 for distribution "K1".
c(m, g, a, w) of length 4 for distribution "K2".
c(m, g, k, d) of length 4 for distribution "K3".
c(m, g, k, e) of length 4 for distribution "K4".
c(m, g, a, k, w, d, e) of length 7 extracted from an object of class
clregk like regkienerLX (typically "reg$coefk").
Model "K1" returns results with 1+2=3 parameters and describes a
(assumed) symmetric distribution. Parameters d and e are set to 0.
Models "K2", "K3" and "K4" describe asymmetric distributions. They return
results with 1+3=4 parameters.
Model "K2" has a very clear parameter definition but unfortunately
parameters a and w are highly correlated.
Model "K3" has the least correlated parameters but the meaning of
the distortion parameter d, usually of order 1e-3, is not simple.
Model "K4" exhibits a reasonable correlation between parameters
and should be the preferred intermediate model between "K1" and "K2" models.
The eccentricity parameter e is well defined and easy to understand:
e=(a-w)/(a+w), a=k/(1-e) and w=k/(1+e). It varies between -1 and +1
and can be understood as a percentage (if multiplied by 100) of eccentricity.
e = -1 corresponds to w = infinity, e = +1 corresponds to a = infinity
and the model becomes a single log-logistic function with a right / left
stopping point and a left / right tail.
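The relations between a, w, k and e quoted above can be checked numerically. The snippet below is a small base-R sketch with arbitrary illustrative values; in practice the package's own converters (see aw2k and pk2pk) are the natural choice.

```r
# Arbitrary left and right tail parameters (illustrative values)
a <- 4
w <- 6

# k is the harmonic mean of a and w
k <- 2 / (1 / a + 1 / w)

# Eccentricity as defined in the text
e <- (a - w) / (a + w)

# Going back from (k, e) to (a, w)
a_back <- k / (1 - e)
w_back <- k / (1 + e)

c(k = k, e = e, a_back = a_back, w_back = w_back)
```

Here a < w gives a negative eccentricity, and the round trip from (k, e) recovers the original a and w.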
Lower and upper values of the tail parameter are controlled by mink and
maxk. An upper value maxk = 10 is appropriate for datasets of low and medium
size, less than 50,000 points. For larger datasets, the upper limit can be
extended up to maxk = 20. Such a limit returns results which are very close
to the logistic distribution, an alternate distribution which could be more
appropriate. The lower limit mink is intended to avoid the value k=0.
Recall that k < 2 describes a distribution with no stable variance and
k < 1 describes a distribution with no stable mean.
The output is an object in a flat format of class clregk. It can be
listed with the function attributes.
- First are the data.frames with the initial data and the estimated results.
- Second is the result of the regression regk0 given by nlsLM, from which some information has been extracted and listed here.
- Third are the regression parameters (without the median) in plain format (no rounding), the variance-covariance matrix, the variance-covariance matrix times 1e+6 and the correlation matrix in a rounded format. Note that regk0, coefk0, coefk0tt, vcovk0, mcork0 have a polymorphic format and changing parameters that depend on the selected model: "K1", "K2", "K3", "K4". They should be used with care in subsequent calculations.
- Fourth are the distribution parameters tailored to every model "K1", "K2", "K3", "K4" plus estimated quantiles at levels: c(0.001, 0.005, 0.01, 0.05, 0.5, 0.95, 0.99, 0.995, 0.999). They are intended for subsequent calculations.
- Fifth are the same parameters presented in a more readable format thanks to the vector pdgts which controls the rounding of the parameters in the following order: pdgts = c("m","g","a","k","w","d","e","vcovk0","vcovk0m","mcork0","quantr").
- Sixth are some probabilities and the corresponding estimated quantiles and estimated Expected Shortfall stored in a data.frame format.
- Last is fitk which returns all parameters in the same format as fitkienerX, optionally subsetted by exfitk. IMPORTANT: if you need to subset fitk, always subset it by parameter names and never by rank number, as new items may be added in the future. Use for instance exfitk = exfit0, ..., exfit7.
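The advice on subsetting fitk by name can be illustrated with a toy named vector. Everything below is hypothetical: the names and values are made up and only stand in for reg$fitk and for name vectors such as exfit0, ..., exfit7.

```r
# Hypothetical named vector standing in for reg$fitk (names and values made up)
fitk <- c(m = 0.001, g = 0.012, k = 3.4, e = -0.05)

# Hypothetical vector of names, playing the role of exfit0, ..., exfit7
exnames <- c("m", "k")

fitk[exnames]    # selection by name
fitk[c(1, 3)]    # selection by rank, same result for now

# Simulate a later version in which a new item is inserted in the middle
fitk_new <- append(fitk, c(a = 3.6), after = 2)

fitk_new[exnames]    # still returns m and k
fitk_new[c(1, 3)]    # now returns m and a
```

The rank-based selection changes meaning without raising any error, which is why name-based subsetting is the safer habit.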
Value
dfrXP |
data.frame. X = initial quantiles. P = empirical probabilities. |
dfrXL |
data.frame. X = initial quantiles. L = logit of probabilities. |
dfrXR |
data.frame. X = initial quantiles. R = residuals after regression. |
dfrEP |
data.frame. E = estimated quantiles. P = probabilities. |
dfrEL |
data.frame. E = estimated quantiles. L = logit of probabilities. |
dfrED |
data.frame. E = estimated quantiles. D = estimated density (from probabilities). |
regk0 |
object of class |
coefk0 |
the regression parameters in plain format. The median is out of the regression. |
vcovk0 |
rounded variance-covariance matrix. |
vcovk0m |
rounded 1e+6 times variance-covariance matrix. |
mcork0 |
rounded correlation matrix. |
coefk |
all parameters in plain format. |
coefk1 |
parameters for model "K1". |
coefk2 |
parameters for model "K2". |
coefk3 |
parameters for model "K3". |
coefk4 |
parameters for model "K4". |
quantk |
quantiles of interest. |
coefr |
all parameters in a rounded format. |
coefr1 |
rounded parameters for model "K1". |
coefr2 |
rounded parameters for model "K2". |
coefr3 |
rounded parameters for model "K3". |
coefr4 |
rounded parameters for model "K4". |
quantr |
quantiles of interest in a rounded format. |
dfrQkPk |
data.frame. Qk = Estimated quantiles of interest. Pk = probabilities. |
dfrQkLk |
data.frame. Qk = Estimated quantiles of interest. Lk = Logit of probabilities. |
dfrESkPk |
data.frame. ESk = Estimated Expected Shortfall. Pk = probabilities. |
dfrESkLk |
data.frame. ESk = Estimated Expected Shortfall. Lk = Logit of probabilities. |
fitk |
Parameters, quantiles, moments, VaR, ES and other parameters (not rounded).
Length of |
See Also
nlsLM, laplacegaussnorm,
Kiener distributions K1, K2, K3 and K4: kiener1, kiener2, kiener3, kiener4.
Other estimation function: fitkienerX and its derivatives.
fitk subsetting: exfit0.
Examples
require(graphics)
require(minpack.lm)
require(timeSeries)
### Load the datasets and select one number (1-16)
DS <- getDSdata()
j <- 5
### and run this block
X <- DS[[j]]
nameX <- names(DS)[j]
reg <- regkienerLX(X)
## Plotting
lleg <- c("logit(0.999) = 6.9", "logit(0.99) = 4.6",
"logit(0.95) = 2.9", "logit(0.50) = 0",
"logit(0.05) = -2.9", "logit(0.01) = -4.6",
"logit(0.001) = -6.9 ")
pleg <- c( paste("m =", reg$coefr4[1]), paste("g =", reg$coefr4[2]),
paste("k =", reg$coefr4[3]), paste("e =", reg$coefr4[4]) )
op <- par(mfrow=c(2,2), mgp=c(1.5,0.8,0), mar=c(3,3,2,1))
plot(X, type="l", main = nameX)
plot(reg$dfrXL, main = nameX, yaxt = "n")
axis(2, las=1, at=c(-9.2, -6.9, -4.6, -2.9, 0, 2.9, 4.6, 6.9, 9.2))
abline(h = c(-4.6, 4.6), lty = 4)
abline(v = c(reg$quantk[5], reg$quantk[9]), lty = 4)
legend("topleft", legend = lleg, cex = 0.7, inset = 0.02, bg = "#FFFFFF")
lines(reg$dfrEL, col = 2, lwd = 2)
points(reg$dfrQkLk, pch = 3, col = 2, lwd = 2, cex = 1.5)
plot(reg$dfrXP, main = nameX)
legend("topleft", legend = pleg, cex = 0.9, inset = 0.02 )
lines(reg$dfrEP, col = 2, lwd = 2)
plot(density(X), main = nameX)
lines(reg$dfrED, col = 2, lwd = 2)
par(op)  # restore the graphical parameters saved in op
round(cbind("k" = kmoments(reg$coefk, lengthx = nrow(reg$dfrXL)),
            "X" = xmoments(X)), 2)
## Attributes
attributes(reg)
head(reg$dfrXP)
head(reg$dfrXL)
head(reg$dfrXR)
head(reg$dfrEP)
head(reg$dfrEL)
head(reg$dfrED)
reg$regk0
reg$coefk0
reg$vcovk0
reg$vcovk0m
reg$mcork0
reg$coefk
reg$coefk1
reg$coefk2
reg$coefk3
reg$coefk4
reg$quantk
reg$coefr
reg$coefr1
reg$coefr2
reg$coefr3
reg$coefr4
reg$quantr
reg$dfrQkPk
reg$dfrQkLk
reg$dfrESkPk
reg$dfrESkLk
reg$fitk
## subset fitk
names(reg$fitk)
reg$fitk[exfit6]
reg$fitk[c(exfit1, exfit4)]
### End block