R: Generalized Beta Distribution of the Second Kind

genbetaII {VGAM}

R Documentation

Generalized Beta Distribution of the Second Kind

Description

Maximum likelihood estimation of the 4-parameter generalized beta II distribution.

Usage

genbetaII(lscale = "loglink", lshape1.a = "loglink",
     lshape2.p = "loglink", lshape3.q = "loglink",
     iscale = NULL, ishape1.a = NULL,
     ishape2.p = NULL, ishape3.q = NULL, lss = TRUE,
     gscale = exp(-5:5), gshape1.a = exp(-5:5),
     gshape2.p = exp(-5:5), gshape3.q = exp(-5:5), zero = "shape")

Arguments

`lss`	See `CommonVGAMffArguments` for important information.
`lshape1.a`, `lscale`, `lshape2.p`, `lshape3.q`	Parameter link functions applied to the shape parameter `a`, scale parameter `scale`, shape parameter `p`, and shape parameter `q`. All four parameters are positive. See `Links` for more choices.
`iscale`, `ishape1.a`, `ishape2.p`, `ishape3.q`	Optional initial values for the parameters. `NULL` means a value is computed internally by a grid search over the arguments `gscale`, `gshape1.a`, etc.
`gscale`, `gshape1.a`, `gshape2.p`, `gshape3.q`	Grids of candidate initial values; see `CommonVGAMffArguments` for information. A grid is ignored for any parameter whose initial value (`iscale`, `ishape1.a`, etc.) is supplied.
`zero`	The default is to set all the shape parameters to be intercept-only. See `CommonVGAMffArguments` for information.

Details

This distribution is most useful for unifying a substantial number of size distributions. For example, the Singh-Maddala, Dagum, Fisk (log-logistic), Lomax (Pareto type II), inverse Lomax, and beta distribution of the second kind are all special cases. Full details can be found in Kleiber and Kotz (2003) and Brazauskas (2002). The argument names given here are also used by the other family functions that are special cases of this family. Fisher scoring is used here and for the special cases too.

The 4-parameter generalized beta II distribution has density

f(y) = a y^{ap-1} / [b^{ap} B(p,q) \{1 + (y/b)^a\}^{p+q}]

for a > 0, b > 0, p > 0, q > 0, y \geq 0. Here B is the beta function, b is the scale parameter (argument `scale`), and a, p and q are shape parameters. The mean is
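The density formula can be checked numerically with a short base-R sketch. The helper name `gb2_density` is hypothetical and is not part of VGAM; it transcribes the formula on the log scale, confirms that the density integrates to approximately 1, and confirms that setting a = p = 1 reproduces the Lomax (Pareto type II) density (q/b)(1 + y/b)^{-(q+1)}.

```r
# Hypothetical helper (not a VGAM function): GB2 density written directly
# from the formula, evaluated on the log scale for numerical stability.
gb2_density <- function(y, scale, shape1.a, shape2.p, shape3.q) {
  a <- shape1.a; b <- scale; p <- shape2.p; q <- shape3.q
  log_f <- log(a) + (a * p - 1) * log(y) - a * p * log(b) -
    lbeta(p, q) - (p + q) * log1p((y / b)^a)
  exp(log_f)
}

# The density should integrate to (approximately) 1 over (0, Inf).
total <- integrate(gb2_density, lower = 0, upper = Inf,
                   scale = 2, shape1.a = 3, shape2.p = 1.5,
                   shape3.q = 2)$value

# Special case a = p = 1: the Lomax density (q/b) * (1 + y/b)^(-(q + 1)).
y <- c(0.5, 1, 4)
lomax <- (2.5 / 2) * (1 + y / 2)^(-(2.5 + 1))
gb2_as_lomax <- gb2_density(y, scale = 2, shape1.a = 1,
                            shape2.p = 1, shape3.q = 2.5)
all.equal(lomax, gb2_as_lomax)
```

The other special cases follow the same pattern: for example, p = 1 gives the Singh-Maddala distribution and q = 1 gives the Dagum distribution.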

E(Y) = b \, \Gamma(p + 1/a) \, \Gamma(q - 1/a) / (\Gamma(p) \, \Gamma(q))

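provided -ap < 1 < aq; the mean is returned as the fitted values. Because a and p are positive, the left inequality always holds, so the binding condition is aq > 1.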
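As an illustration of the mean formula and its existence condition, the following base-R sketch uses a hypothetical helper `gb2_mean` (not a VGAM function) that returns `NA` when the condition fails. It is checked against the Fisk (log-logistic) case a = 2, p = q = 1, b = 1, whose mean is \Gamma(3/2) \Gamma(1/2) = \pi/2.

```r
# Hypothetical helper (not a VGAM function): mean of the GB2 distribution,
# or NA when the condition -ap < 1 < aq fails and the mean is not finite.
gb2_mean <- function(scale, shape1.a, shape2.p, shape3.q) {
  a <- shape1.a; b <- scale; p <- shape2.p; q <- shape3.q
  if (!(-a * p < 1 && 1 < a * q)) return(NA_real_)
  # Work with log-gamma values to avoid overflow for large shape parameters.
  b * exp(lgamma(p + 1 / a) + lgamma(q - 1 / a) - lgamma(p) - lgamma(q))
}

fisk_mean <- gb2_mean(scale = 1, shape1.a = 2, shape2.p = 1, shape3.q = 1)
all.equal(fisk_mean, pi / 2)

# Here aq = 0.5 < 1, so the mean does not exist and NA is returned.
no_mean <- gb2_mean(scale = 1, shape1.a = 0.5, shape2.p = 1, shape3.q = 1)
```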

This family function handles multiple responses.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

Warning

This distribution is very flexible, and this family function is not generally recommended when the sample size is small because numerical problems easily occur with small samples. Probably at least several hundred observations are needed to estimate the parameters with any level of confidence. Including covariates is not recommended either, unless there are several thousand observations. The mean is finite only when -ap < 1 < aq, and this can easily be violated by the parameter estimates for small sample sizes. Try fitting some of the special cases of this distribution (e.g., sinmad, fisk) first, and then possibly use those models to supply initial values for this distribution.

Note

Successful convergence depends on having very good initial values, which are rather difficult to obtain for this distribution. The default is therefore a grid search with respect to all four parameters, which is computationally costly. If the self-starting initial values fail, try experimenting with the initial value arguments. Also, the constraint -ap < 1 < aq may be violated as the iterations progress, so it pays to monitor convergence, e.g., set trace = TRUE. One suggestion for increasing the estimation reliability is to set stepsize = 0.5 and maxit = 100; see vglm.control.
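The suggestions in this note might be combined as in the sketch below. It assumes that VGAM is installed, that the control settings can be passed to vglm through its `...` argument (they are forwarded to vglm.control), and that `Coef()` names the estimates `shape1.a`, `shape2.p` and `shape3.q`. The helper `mean_is_finite` is hypothetical and not part of VGAM.

```r
# Hypothetical helper (not a VGAM function): TRUE when -ap < 1 < aq holds,
# i.e. when the fitted mean is finite.
mean_is_finite <- function(shape1.a, shape2.p, shape3.q) {
  (-shape1.a * shape2.p < 1) && (1 < shape1.a * shape3.q)
}

if (requireNamespace("VGAM", quietly = TRUE)) {
  library(VGAM)
  set.seed(123)
  gdata <- data.frame(y = rsinmad(3000, scale = exp(2),
                                  shape1.a = exp(1), shape3.q = exp(1)))
  # Supplying all four initial values avoids the default grid search;
  # stepsize and maxit follow the suggestion in the note.
  fit <- vglm(y ~ 1,
              genbetaII(iscale = 7, ishape1.a = 3,
                        ishape2.p = 1, ishape3.q = 2.3),
              data = gdata, trace = TRUE, stepsize = 0.5, maxit = 100)
  est <- Coef(fit)  # assumed to be a named vector of parameter estimates
  print(mean_is_finite(est[["shape1.a"]], est[["shape2.p"]],
                       est[["shape3.q"]]))
}
```

With trace = TRUE the log-likelihood is printed at each iteration, so a run that drifts towards aq <= 1 can be spotted before the final estimates are used.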

Author(s)

T. W. Yee, with help from Victor Miranda.

References

Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences, Hoboken, NJ, USA: Wiley-Interscience.

Brazauskas, V. (2002). Fisher information matrix for the Feller-Pareto distribution. Statistics & Probability Letters, 59, 159–167.

Examples

## Not run: 
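library(VGAM)
# Simulate from the Singh-Maddala distribution, a special case (shape2.p = 1)
gdata <- data.frame(y = rsinmad(3000, scale = exp(2), shape1.a = exp(1),
                                shape3.q = exp(1)))
# Self-starting fit; lss = FALSE selects the non-default parameter ordering
# (see CommonVGAMffArguments)
fit <- vglm(y ~ 1, genbetaII(lss = FALSE), data = gdata, trace = TRUE)
# Refit with user-supplied initial values for three of the four parameters
fit <- vglm(y ~ 1, data = gdata, trace = TRUE,
            genbetaII(ishape1.a = 3, iscale = 7, ishape3.q = 2.3))
coef(fit, matrix = TRUE)  # Coefficients on the link (log) scale
Coef(fit)                 # Back-transformed parameter estimates
summary(fit)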

## End(Not run)

[Package VGAM version 1.1-11 Index]