gev {VGAM} | R Documentation |
Generalized Extreme Value Regression Family Function
Description
Maximum likelihood estimation of the 3-parameter generalized extreme value (GEV) distribution.
Usage
gev(llocation = "identitylink", lscale = "loglink",
lshape = logofflink(offset = 0.5), percentiles = c(95, 99),
ilocation = NULL, iscale = NULL, ishape = NULL, imethod = 1,
gprobs.y = (1:9)/10, gscale.mux = exp((-5:5)/6),
gshape = (-5:5) / 11 + 0.01,
iprobs.y = NULL, tolshape0 = 0.001,
type.fitted = c("percentiles", "mean"),
zero = c("scale", "shape"))
gevff(llocation = "identitylink", lscale = "loglink",
lshape = logofflink(offset = 0.5), percentiles = c(95, 99),
ilocation = NULL, iscale = NULL, ishape = NULL, imethod = 1,
gprobs.y = (1:9)/10, gscale.mux = exp((-5:5)/6),
gshape = (-5:5) / 11 + 0.01,
iprobs.y = NULL, tolshape0 = 0.001,
type.fitted = c("percentiles", "mean"), zero = c("scale", "shape"))
Arguments
llocation , lscale , lshape |
Parameter link functions for For the shape parameter,
the default |
percentiles |
Numeric vector of percentiles used for the fitted values.
Values should be between 0 and 100.
This argument is ignored if |
type.fitted |
See |
ilocation , iscale , ishape |
Numeric. Initial value for the location parameter, |
imethod |
Initialization method. Either the value 1 or 2.
If both methods fail then try using |
gshape |
Numeric vector.
The values are used for a grid search for an initial value
for |
gprobs.y , gscale.mux , iprobs.y |
Numeric vectors, used for the initial values.
See |
tolshape0 |
Passed into |
zero |
A specifying which
linear/additive predictors are modelled as intercepts only.
The values can be from the set {1,2,3} corresponding
respectively to |
Details
The GEV distribution function can be written
G(y) = \exp( -[ (y-\mu)/ \sigma ]_{+}^{- 1/ \xi})
where \sigma > 0
,
-\infty < \mu < \infty
,
and 1 + \xi(y-\mu)/\sigma > 0
.
Here, x_+ = \max(x,0)
.
The \mu
, \sigma
, \xi
are known as the
location, scale and shape parameters respectively.
The cases
\xi>0
,
\xi<0
,
\xi = 0
correspond to the Frechet,
reverse
Weibull, and Gumbel types respectively.
It can be noted that the Gumbel (or Type I) distribution accommodates
many commonly-used distributions such as the normal, lognormal,
logistic, gamma, exponential and Weibull.
For the GEV distribution, the k
th moment about the mean exists
if \xi < 1/k
.
Provided they exist, the mean and variance are given by
\mu+\sigma\{ \Gamma(1-\xi)-1\}/ \xi
and
\sigma^2 \{ \Gamma(1-2\xi) - \Gamma^2(1-\xi) \} / \xi^2
respectively,
where \Gamma
is the gamma function.
Smith (1985) established that when \xi > -0.5
,
the maximum likelihood estimators are completely regular.
To have some control over the estimated \xi
try
using lshape = logofflink(offset = 0.5)
, say,
or lshape = extlogitlink(min = -0.5, max = 0.5)
, say.
Value
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
,
and vgam
.
Warning
Currently, if an estimate of \xi
is too close to 0 then
an error may occur for gev()
with multivariate responses.
In general, gevff()
is more reliable than gev()
.
Fitting the GEV by maximum likelihood estimation can be numerically
fraught. If 1 + \xi (y-\mu)/ \sigma \leq 0
then some crude evasive action is taken but the estimation process
can still fail. This is particularly the case if vgam
with s
is used; then smoothing is best done with
vglm
with regression splines (bs
or ns
) because vglm
implements
half-stepsizing whereas vgam
doesn't (half-stepsizing
helps handle the problem of straying outside the parameter space.)
Note
The VGAM family function gev
can handle a multivariate
(matrix) response, cf. multiple responses.
If so, each row of the matrix is sorted into
descending order and NA
s are put last.
With a vector or one-column matrix response using
gevff
will give the same result but be faster and it handles
the \xi = 0
case.
The function gev
implements Tawn (1988) while
gevff
implements Prescott and Walden (1980).
Function egev()
has been replaced by the
new family function gevff()
. It now
conforms to the usual VGAM philosophy of
having M1
linear predictors per (independent) response.
This is the usual way multiple responses are handled.
Hence vglm(cbind(y1, y2)..., gevff, ...)
will have
6 linear predictors and it is possible to constrain the
linear predictors so that the answer is similar to gev()
.
Missing values in the response of gevff()
will be deleted;
this behaviour is the same as with almost every other
VGAM family function.
The shape parameter \xi
is difficult to estimate
accurately unless there is a lot of data.
Convergence is slow when \xi
is near -0.5
.
Given many explanatory variables, it is often a good idea
to make sure zero = 3
.
The range restrictions of the parameter \xi
are not
enforced; thus it is possible for a violation to occur.
Successful convergence often depends on having a reasonably good initial
value for \xi
. If failure occurs try various values for the
argument ishape
, and if there are covariates,
having zero = 3
is advised.
Author(s)
T. W. Yee
References
Yee, T. W. and Stephenson, A. G. (2007). Vector generalized linear and additive extreme value models. Extremes, 10, 1–19.
Tawn, J. A. (1988). An extreme-value theory model for dependent observations. Journal of Hydrology, 101, 227–250.
Prescott, P. and Walden, A. T. (1980). Maximum likelihood estimation of the parameters of the generalized extreme-value distribution. Biometrika, 67, 723–724.
Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90.
See Also
rgev
,
gumbel
,
gumbelff
,
guplot
,
rlplot.gevff
,
gpd
,
weibullR
,
frechet
,
extlogitlink
,
oxtemp
,
venice
,
CommonVGAMffArguments
.
Examples
## Not run:
# Multivariate example
fit1 <- vgam(cbind(r1, r2) ~ s(year, df = 3), gev(zero = 2:3),
data = venice, trace = TRUE)
coef(fit1, matrix = TRUE)
head(fitted(fit1))
par(mfrow = c(1, 2), las = 1)
plot(fit1, se = TRUE, lcol = "blue", scol = "forestgreen",
main = "Fitted mu(year) function (centered)", cex.main = 0.8)
with(venice, matplot(year, depvar(fit1)[, 1:2], ylab = "Sea level (cm)",
col = 1:2, main = "Highest 2 annual sea levels", cex.main = 0.8))
with(venice, lines(year, fitted(fit1)[,1], lty = "dashed", col = "blue"))
legend("topleft", lty = "dashed", col = "blue", "Fitted 95 percentile")
# Univariate example
(fit <- vglm(maxtemp ~ 1, gevff, data = oxtemp, trace = TRUE))
head(fitted(fit))
coef(fit, matrix = TRUE)
Coef(fit)
vcov(fit)
vcov(fit, untransform = TRUE)
sqrt(diag(vcov(fit))) # Approximate standard errors
rlplot(fit)
## End(Not run)