extbetabinomial {VGAM} | R Documentation |
Extended Beta-binomial Distribution Family Function
Description
Fits an extended beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.
Usage
extbetabinomial(lmu = "logitlink", lrho = "cloglink",
zero = "rho", irho = 0, grho = c(0, 0.05, 0.1, 0.2),
vfl = FALSE, Form2 = NULL,
imethod = 1, ishrinkage = 0.95)
Arguments
lmu , lrho |
Link functions applied to the two parameters.
See |
irho , grho |
The first is similar to |
imethod |
Similar to |
zero |
Similar to |
ishrinkage |
See |
vfl , Form2 |
See |
Details
The extended beta-binomial
distribution (EBBD) proposed
by Prentice (1986)
allows for a slightly negative correlation
parameter whereas the ordinary BBD
betabinomial
only allows values in so it
handles overdispersion only.
When negative, the data is underdispersed
relative to an ordinary binomial distribution.
Argument rho
is used here for the
used in Prentice (1986) because
it is the correlation between the
(almost) Bernoulli trials.
(They are actually simple binary variates.)
We use here
for the number of trials
(e.g., litter size),
is the number of successes, and
(or
)
is the probability of a success
(e.g., a malformation).
That is,
is the proportion
of successes. Like
binomialff
, the fitted values
are the
estimated probability
of success
(i.e., and not
)
and the prior weights
are attached
separately on the object in a slot.
The probability function is difficult
to write but it involves three
series of products.
Recall is the real response
being modelled, where
is the
(total) sum of
correlated
(almost) Bernoulli trials.
The default model is
and
because the first
parameter lies between 0 and 1.
The second link is
cloglink
.
The mean of is
and the variance of
is
.
Here, the correlation
may be slightly negative
and is the correlation between the
individuals within a litter.
A litter effect is typically reflected
by a positive value of
and
corresponds to overdispersion with
respect to the binomial distribution.
Thus an exchangeable error structure
is assumed between units within a litter
for the EBBD.
This family function uses Fisher scoring.
Elements of the second-order expected
derivatives
are computed numerically, which may
fail for models very near the boundary of the
parameter space.
Usually, the computations
are expensive for large because of
a
for
loop, so
it may take a long time.
Value
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions
such as vglm
.
Suppose fit
is a fitted EBB
model. Then depvar(fit)
are the sample proportions ,
fitted(fit)
returns
estimates of ,
and
weights(fit, type = "prior")
returns
the number of trials .
Warning
Modelling rho
using covariates well
requires much data
so it is usually best to leave zero
alone.
It is good to set trace = TRUE
and
play around with irho
if there are
problems achieving convergence.
Convergence problems will occur when the
estimated rho
is close to the
lower bound,
i.e., the underdispersion
is almost too severe for the EBB to cope.
Note
This function is recommended over
betabinomial
and
betabinomialff
.
It processes the input in the
same way as binomialff
.
But it does not handle the case
very well because there are two parameters to
estimate, not one, for each row of the input.
Cases where
can be selected via the
subset
argument of vglm
.
Author(s)
T. W. Yee
References
Prentice, R. L. (1986). Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.
See Also
Extbetabinom
,
betabinomial
,
betabinomialff
,
binomialff
,
dirmultinomial
,
cloglink
,
lirat
.
Examples
# Example 1
edata <- data.frame(N = 10, mu = 0.5, rho = 0.1)
edata <- transform(edata,
y = rextbetabinom(100, N, mu, rho = rho))
fit1 <- vglm(cbind(y, N-y) ~ 1, extbetabinomial, edata, trace = TRUE)
coef(fit1, matrix = TRUE)
Coef(fit1)
head(cbind(depvar(fit1), weights(fit1, type = "prior")))
# Example 2: VFL model
## Not run: N <- size1 <- 10; nn <- 2000; set.seed(1)
edata <- # Generate the data set. Expensive.
data.frame(x2 = runif(nn),
ooo = log(size1 / (size1 - 1)))
edata <- transform(edata, x1copy = 1, x2copy = x2,
y2 = rextbetabinom(nn, size1, # Expensive
logitlink(1 + x2, inverse = TRUE),
cloglink(ooo + 1 - 0.5 * x2, inv = TRUE)))
fit2 <- vglm(data = edata,
cbind(y2, N - y2) ~ x2 + x1copy + x2copy,
extbetabinomial(zero = NULL, vfl = TRUE,
Form2 = ~ x1copy + x2copy - 1),
offset = cbind(0, ooo), trace = TRUE)
coef(fit2, matrix = TRUE)
wald.stat(fit2, values0 = c(1, 1, -0.5))
## End(Not run)