extbetabinomial {VGAM} | R Documentation |
Extended Beta-binomial Distribution Family Function
Description
Fits an extended beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.
Usage
extbetabinomial(lmu = "logitlink", lrho = "cloglink",
zero = "rho", irho = 0, grho = c(0, 0.05, 0.1, 0.2),
vfl = FALSE, Form2 = NULL,
imethod = 1, ishrinkage = 0.95)
Arguments
lmu , lrho |
Link functions applied to the two parameters.
See |
irho , grho |
The first is similar to |
imethod |
Similar to |
zero |
Similar to |
ishrinkage |
See |
vfl , Form2 |
See |
Details
The extended beta-binomial
distribution (EBBD) proposed
by Prentice (1986)
allows for a slightly negative correlation
parameter whereas the ordinary BBD
betabinomial
only allows values in (0, 1)
so it
handles overdispersion only.
When negative, the data is underdispersed
relative to an ordinary binomial distribution.
Argument rho
is used here for the
\delta
used in Prentice (1986) because
it is the correlation between the
(almost) Bernoulli trials.
(They are actually simple binary variates.)
We use here
N
for the number of trials
(e.g., litter size),
T=NY
is the number of successes, and
p
(or \mu
)
is the probability of a success
(e.g., a malformation).
That is, Y
is the proportion
of successes. Like
binomialff
, the fitted values
are the
estimated probability
of success
(i.e., E[Y]
and not E[T]
)
and the prior weights N
are attached
separately on the object in a slot.
The probability function is difficult
to write but it involves three
series of products.
Recall Y = T/N
is the real response
being modelled, where T
is the
(total) sum of N
correlated
(almost) Bernoulli trials.
The default model is
\eta_1 = logit(\mu)
and \eta_2 = clog(\rho)
because the first
parameter lies between 0 and 1.
The second link is cloglink
.
The mean of Y
is
p=\mu
and the variance of Y
is
\mu(1-\mu)(1+(N-1)\rho)/N
.
Here, the correlation \rho
may be slightly negative
and is the correlation between the N
individuals within a litter.
A litter effect is typically reflected
by a positive value of \rho
and
corresponds to overdispersion with
respect to the binomial distribution.
Thus an exchangeable error structure
is assumed between units within a litter
for the EBBD.
This family function uses Fisher scoring.
Elements of the second-order expected
derivatives
are computed numerically, which may
fail for models very near the boundary of the
parameter space.
Usually, the computations
are expensive for large N
because of
a for
loop, so
it may take a long time.
Value
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions
such as vglm
.
Suppose fit
is a fitted EBB
model. Then depvar(fit)
are the sample proportions y
,
fitted(fit)
returns
estimates of E(Y)
,
and weights(fit, type = "prior")
returns
the number of trials N
.
Warning
Modelling rho
using covariates well
requires much data
so it is usually best to leave zero
alone.
It is good to set trace = TRUE
and
play around with irho
if there are
problems achieving convergence.
Convergence problems will occur when the
estimated rho
is close to the
lower bound,
i.e., the underdispersion
is almost too severe for the EBB to cope.
Note
This function is recommended over
betabinomial
and
betabinomialff
.
It processes the input in the
same way as binomialff
.
But it does not handle the case N \leq 2
very well because there are two parameters to
estimate, not one, for each row of the input.
Cases where N > 2
can be selected via the
subset
argument of vglm
.
Author(s)
T. W. Yee
References
Prentice, R. L. (1986). Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.
See Also
Extbetabinom
,
betabinomial
,
betabinomialff
,
binomialff
,
dirmultinomial
,
cloglink
,
lirat
.
Examples
# Example 1
edata <- data.frame(N = 10, mu = 0.5, rho = 0.1)
edata <- transform(edata,
y = rextbetabinom(100, N, mu, rho = rho))
fit1 <- vglm(cbind(y, N-y) ~ 1, extbetabinomial, edata, trace = TRUE)
coef(fit1, matrix = TRUE)
Coef(fit1)
head(cbind(depvar(fit1), weights(fit1, type = "prior")))
# Example 2: VFL model
## Not run: N <- size1 <- 10; nn <- 2000; set.seed(1)
edata <- # Generate the data set. Expensive.
data.frame(x2 = runif(nn),
ooo = log(size1 / (size1 - 1)))
edata <- transform(edata, x1copy = 1, x2copy = x2,
y2 = rextbetabinom(nn, size1, # Expensive
logitlink(1 + x2, inverse = TRUE),
cloglink(ooo + 1 - 0.5 * x2, inv = TRUE)))
fit2 <- vglm(data = edata,
cbind(y2, N - y2) ~ x2 + x1copy + x2copy,
extbetabinomial(zero = NULL, vfl = TRUE,
Form2 = ~ x1copy + x2copy - 1),
offset = cbind(0, ooo), trace = TRUE)
coef(fit2, matrix = TRUE)
wald.stat(fit2, values0 = c(1, 1, -0.5))
## End(Not run)