glmReserve {ChainLadder} | R Documentation |
GLM-based Reserving Model
Description
This function implements loss reserving models within the generalized linear model framework. It takes accident year and development lag as mean predictors in estimating the ultimate loss reserves, and provides both analytical and bootstrapping methods to compute the associated prediction errors. The bootstrapping approach also generates the full predictive distribution for loss reserves.
Usage
glmReserve(triangle, var.power = 1, link.power = 0, cum = TRUE,
           mse.method = c("formula", "bootstrap"), nsim = 1000, nb = FALSE, ...)
Arguments
triangle |
An object of class |
var.power |
The index (p) of the power variance function |
link.power |
The index of power link function. The default |
cum |
A logical value indicating whether the input triangle is on the
cumulative or the incremental scale. If |
mse.method |
A character indicating whether the prediction error should be computed analytically ( |
nsim |
Number of simulations to be performed in the bootstrapping, with a default value of 1000. |
nb |
Whether the negative binomial distribution is used. If |
... |
Arguments to be passed onto the function |
Details
This function takes an insurance loss triangle, converts it to incremental losses internally if necessary, transforms it to the long format (see as.data.frame) and fits the resulting loss data with a generalized linear model where the mean structure includes both the accident year and the development lag effects.
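The preprocessing described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the 3 x 3 triangle and the names cum_tri, inc_tri, long, observed and future are hypothetical, and a quasi-Poisson family with log link stands in for the default var.power = 1, link.power = 0 setting.

```r
# Hypothetical 3 x 3 cumulative triangle (values invented for illustration)
cum_tri <- matrix(c(100, 150, 175,
                    110, 168,  NA,
                    120,  NA,  NA),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(origin = as.character(1:3),
                                  dev    = as.character(1:3)))

# Cumulative -> incremental: subtract the previous development column
inc_tri <- cum_tri
inc_tri[, -1] <- cum_tri[, -1] - cum_tri[, -ncol(cum_tri)]

# Wide -> long: one row per (origin, dev) cell, in column-major order
long <- data.frame(
  origin = factor(rep(rownames(inc_tri), times = ncol(inc_tri))),
  dev    = factor(rep(colnames(inc_tri), each = nrow(inc_tri))),
  value  = as.vector(inc_tri)
)

# Fit accident-year and development-lag effects on the observed cells only
observed <- long[!is.na(long$value), ]
fit <- glm(value ~ origin + dev, family = quasipoisson(link = "log"),
           data = observed)

# Predict the unobserved (future) cells and sum them into a total reserve
future <- long[is.na(long$value), ]
future$value <- predict(fit, newdata = future, type = "response")
total_reserve <- sum(future$value)
```

The reserve is the sum of the predicted incremental losses in the lower-right part of the triangle, which is why the conversion to the incremental scale comes first.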
The distributions allowed are the members of the exponential family that admit a power variance function, that is, V(\mu)=\mu^p.
This subclass of distributions is usually called the Tweedie family and includes many commonly used distributions as special cases.
This function does not allow the user to specify the GLM options through the usual family argument, but instead, it uses the tweedie family internally and takes two arguments, var.power and link.power, through which the user still has full control of the distribution forms and link functions.
The argument var.power determines which specific distribution is to be used, and link.power determines the form of the link function.
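As a compact reminder of what the two arguments control, the variance and link functions can be written as follows. The symbols p and q are shorthand introduced here for var.power and link.power; the listed special cases are the standard ones for the power variance family.

```latex
V(\mu) = \mu^{p}, \qquad p = \texttt{var.power}

g(\mu) =
\begin{cases}
  \mu^{q},   & q \neq 0 \\
  \log \mu,  & q = 0
\end{cases}
\qquad q = \texttt{link.power}

% Special cases of p:
%   p = 0      normal
%   p = 1      (over-dispersed) Poisson
%   1 < p < 2  compound Poisson
%   p = 2      gamma
%   p = 3      inverse Gaussian
```

With the defaults in the Usage section (var.power = 1, link.power = 0) this gives the over-dispersed Poisson model with a log link; link.power = 1 corresponds to the identity link.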
When the Tweedie compound Poisson distribution (1 < p < 2) is to be used, the user has the option to specify var.power = NULL, in which case the variance power p will be estimated from the data using the cplm package. The documentation of the bcplm function in the cplm package also has an example of the Bayesian compound Poisson loss reserving model.
See details in tweedie, cpglm and bcplm.
glmReserve allows certain measures of exposures to be used in an offset term in the underlying GLM.
To do this, the user should not use the usual offset argument in glm.
Instead, one specifies the exposure measure for each accident year through the exposure attribute of triangle.
Make sure that these exposures are in the original scale (no log transformations for example).
If the vector is named, make sure the names coincide with the rownames/origin of the triangle.
If the vector is unnamed, make sure the exposures are in the order consistent with the accident years, and the character rownames of the Triangle must be convertible to numeric.
If the exposure attribute is not NULL, the glmReserve function will use these exposures, link-function-transformed, in the offset term of the GLM.
For example, if the link function is log, then the log of the exposure is used as the offset, not the original exposure.
See the examples below.
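The offset construction described above can be illustrated with a plain glm call. This is a hedged sketch rather than the package's implementation: the data, the exposure values and the helper link_transform are hypothetical, and the power branch of the helper is an assumption about what "link-function-transformed" means for a non-log power link.

```r
# Hypothetical incremental losses in long format (observed cells only)
observed <- data.frame(
  origin = factor(c(1, 2, 3, 1, 2, 1)),
  dev    = factor(c(1, 1, 1, 2, 2, 3)),
  value  = c(100, 110, 120, 50, 58, 25)
)

# Hypothetical exposures on the original scale, named by accident year
exposure <- c("1" = 1000, "2" = 1100, "3" = 1200)

# Match exposures to rows by name, so the ordering of the vector is irrelevant
observed$expo <- unname(exposure[as.character(observed$origin)])

# Assumed transformation: log for link.power = 0, a power otherwise
link_transform <- function(x, link.power) {
  if (link.power == 0) log(x) else x^link.power
}
observed$off <- link_transform(observed$expo, link.power = 0)

# The transformed exposure, not the raw exposure, enters as the offset
fit_off <- glm(value ~ origin + dev + offset(off),
               family = quasipoisson(link = "log"), data = observed)
```

Supplying the exposure on the original scale and letting the transformation happen in one place avoids applying the log twice.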
Moreover, the user MUST NOT supply the typical offset or weight as arguments in the list of additional arguments (...). offset should be specified as above, while weight is not implemented (due to prediction reasons).
Two methods are available to assess the prediction error of the estimated loss reserves.
One uses the analytical formula (mse.method = "formula") derived from the first-order Taylor approximation.
The other uses bootstrapping (mse.method = "bootstrap"), which reconstructs the triangle nsim times by sampling with replacement from the GLM (Pearson) residuals.
Each time a new triangle is formed, a GLM is fitted and the corresponding loss reserves are generated.
Based on these predicted mean loss reserves, and the model assumption about the distribution forms, realizations of the predicted values are generated via the rtweedie function.
Prediction errors as well as other uncertainty measures such as quantiles and predictive intervals can be calculated based on these samples.
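The bootstrap procedure can be sketched in base R for the over-dispersed Poisson case. This is a simplified illustration, not the code used by glmReserve: the data and all object names are hypothetical, residuals are resampled without any degrees-of-freedom adjustment, negative pseudo-losses are truncated at zero as a crude guard, and the process variance is simulated as phi * Poisson(mu / phi) in place of rtweedie.

```r
set.seed(1)

# Hypothetical incremental losses (observed cells) and the cells to predict
observed <- data.frame(
  origin = factor(c(1, 2, 3, 1, 2, 1)),
  dev    = factor(c(1, 1, 1, 2, 2, 3)),
  value  = c(100, 110, 120, 50, 58, 25)
)
future <- data.frame(
  origin = factor(c(2, 3, 3), levels = 1:3),
  dev    = factor(c(3, 2, 3), levels = 1:3)
)

# Original fit: fitted means, dispersion and Pearson residuals
fit <- glm(value ~ origin + dev, family = quasipoisson(link = "log"),
           data = observed)
mu  <- fitted(fit)
phi <- summary(fit)$dispersion
res <- residuals(fit, type = "pearson")

n_sim    <- 200
sim_mean <- numeric(n_sim)  # total reserve without process variance
sim_pred <- numeric(n_sim)  # total reserve with process variance
for (b in seq_len(n_sim)) {
  # Rebuild a pseudo-triangle from resampled Pearson residuals
  boot <- observed
  boot$value <- pmax(mu + sample(res, replace = TRUE) * sqrt(mu), 0)

  # Refit the GLM and predict the future cells
  fit_b <- glm(value ~ origin + dev, family = quasipoisson(link = "log"),
               data = boot)
  mu_b  <- predict(fit_b, newdata = future, type = "response")

  sim_mean[b] <- sum(mu_b)
  # Add process variance: mean mu_b, variance phi * mu_b
  sim_pred[b] <- sum(phi * rpois(length(mu_b), mu_b / phi))
}

# Prediction error and quantiles of the total reserve
sd(sim_pred)
quantile(sim_pred, c(0.025, 0.5, 0.975))
```

Keeping sim_mean and sim_pred separate mirrors the distinction between simulated mean reserves and simulated realizations that include the process variance.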
Value
The output is an object of class "glmReserve" that has the following components:
call |
the matched call. |
summary |
A data frame containing the predicted loss reserve statistics. Similar to the summary statistics from |
Triangle |
The input triangle. |
FullTriangle |
The completed triangle, where empty cells in the original triangle are filled with model predictions. |
model |
The fitted GLM, a class of |
sims.par |
a matrix of the simulated parameter values in the bootstrapping. |
sims.reserve.mean |
a matrix of the simulated mean loss reserves (without the process variance) for each year in the bootstrapping. |
sims.reserve.pred |
a matrix of the simulated realizations of the loss reserves (with the process variance) for each year in the bootstrapping. This can be used to summarize the predictive uncertainty of the loss reserves. |
Note
The use of GLM in insurance loss reserving has many compelling aspects, e.g.,
when the over-dispersed Poisson model is used, it reproduces the estimates from Chain Ladder;
it provides a more coherent modeling framework than the Mack method;
all the relevant established statistical theory can be directly applied to perform hypothesis testing and diagnostic checking.
However, the user should be cautious of some of the key assumptions that underlie the GLM model, in order to determine whether this model is appropriate for the problem considered:
the GLM model assumes no tail development, and it only projects losses to the latest time point of the observed data. To use a model that enables tail extrapolation, please consider the growth curve model ClarkLDF or ClarkCapeCod;
the model assumes that each incremental loss is independent of all the others. This assumption may not be valid in that cells from the same calendar year are usually correlated due to inflation or business operating factors;
the model tends to be over-parameterized, which may lead to inferior predictive performance.
To address these potential problems, many variants of the current basic GLM model have been proposed in the actuarial literature. Some of these may be included in a future release.
Support for the negative binomial GLM has been available since version 0.2.3.
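The chain-ladder equivalence stated in the first item of this note can be checked numerically in base R. The sketch below is an illustration on a hypothetical 3 x 3 triangle, with a quasi-Poisson log-link glm standing in for the over-dispersed Poisson model; all object names are invented for the example.

```r
# Hypothetical cumulative triangle (values invented for illustration)
cum_tri <- matrix(c(100, 150, 175,
                    110, 168,  NA,
                    120,  NA,  NA),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(origin = as.character(1:3),
                                  dev    = as.character(1:3)))
n <- ncol(cum_tri)

# Volume-weighted chain-ladder development factors
f <- sapply(seq_len(n - 1), function(j) {
  rows <- !is.na(cum_tri[, j + 1])
  sum(cum_tri[rows, j + 1]) / sum(cum_tri[rows, j])
})

# Latest observed cumulative value and projected ultimate per origin
last_dev <- n - seq_len(n) + 1
latest   <- cum_tri[cbind(seq_len(n), last_dev)]
ultimate <- sapply(seq_len(n), function(i) {
  idx <- if (last_dev[i] < n) last_dev[i]:(n - 1) else integer(0)
  latest[i] * prod(f[idx])
})
cl_reserve <- ultimate - latest

# Over-dispersed Poisson GLM on the incremental losses
inc_tri <- cum_tri
inc_tri[, -1] <- cum_tri[, -1] - cum_tri[, -n]
long <- data.frame(
  origin = factor(rep(rownames(inc_tri), times = n)),
  dev    = factor(rep(colnames(inc_tri), each = nrow(inc_tri))),
  value  = as.vector(inc_tri)
)
fit <- glm(value ~ origin + dev, family = quasipoisson(link = "log"),
           data = long[!is.na(long$value), ])
future <- long[is.na(long$value), ]
future$value <- predict(fit, newdata = future, type = "response")
glm_reserve <- sapply(rownames(cum_tri), function(o) {
  sum(future$value[future$origin == o])
})

# The two columns agree up to the convergence tolerance of glm
cbind(chain_ladder = cl_reserve, glm = glm_reserve)
```

The agreement concerns the point estimates only; the two approaches differ in how they quantify prediction error.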
Author(s)
Wayne Zhang actuary_zhang@hotmail.com
References
England P. and Verrall R. (1999). Analytic and bootstrap estimates of prediction errors in claims reserving. Insurance: Mathematics and Economics, 25, 281-293.
See Also
See also glm, glm.nb, tweedie, cpglm and MackChainLadder.
Examples
data(GenIns)
GenIns <- GenIns / 1000
# over-dispersed Poisson: reproduce ChainLadder estimates
(fit1 <- glmReserve(GenIns))
summary(fit1, type = "model") # extract the underlying glm
# which:
# 1 Original triangle
# 2 Full triangle
# 3 Reserve distribution
# 4 Residual plot
# 5 QQ-plot
# plot original triangle
plot(fit1, which = 1, xlab = "dev year", ylab = "cum loss")
# plot residuals
plot(fit1, which = 4, xlab = "fitted values", ylab = "residuals")
# Gamma GLM:
(fit2 <- glmReserve(GenIns, var.power = 2))
# compound Poisson GLM (variance function estimated from the data):
(fit3 <- glmReserve(GenIns, var.power = NULL))
# Now suppose we have an exposure measure
# we can put it as an offset term in the model
# to do this, use the "exposure" attribute of the 'triangle'
expos <- (7 + 1:10 * 0.4) * 100
GenIns2 <- GenIns
attr(GenIns2, "exposure") <- expos
(fit4 <- glmReserve(GenIns2))
# If the triangle's rownames are not convertible to numeric,
# supply names to the exposures
GenIns3 <- GenIns2
rownames(GenIns3) <- paste0(2007:2016, "-01-01")
names(expos) <- rownames(GenIns3)
attr(GenIns3, "exposure") <- expos
(fit4b <- glmReserve(GenIns3))
# use bootstrapping to compute prediction error
## Not run:
set.seed(11)
(fit5 <- glmReserve(GenIns, mse.method = "bootstrap"))
# compute the quantiles of the predicted loss reserves
t(apply(fit5$sims.reserve.pred, 2, quantile,
        c(0.025, 0.25, 0.5, 0.75, 0.975)))
# plot distribution of reserve
plot(fit5, which = 3)
## End(Not run)
# alternative over-dispersed Poisson: negative binomial GLM
(fit6 <- glmReserve(GenIns, nb = TRUE))