mleDb {dbd} | R Documentation |
Maximum likelihood estimates of db parameters.
Description
Calculates maximum likelihood estimates of the alpha
and
beta
parameters of a db distribution. Calls upon
optim()
with the "BFGS"
method.
Usage
mleDb(x, ntop, zeta=FALSE, par0=NULL, UB=10, maxit=1000,
covmat=TRUE, useGinv=FALSE)
Arguments
x |
A random sample from the db distribution whose parameters are being estimated. Missing values are allowed. |
ntop |
The |
zeta |
See |
par0 |
Optional starting values for the iterative estimation procedure.
A vector with entries |
UB |
Positive numeric scalar, providing an upper bound on the starting
values used by |
maxit |
Integer scalar. The maximum number of iterations to be undertaken
by |
covmat |
Logical scalar. Should the covariance matrix of the parameter
estimates be calculated? In simulation studies, in which the
covariance matrix is not of interest, calculations might be
speeded up a bit by setting |
useGinv |
Logical scalar. Should the |
Details
The ntop
and zeta
parameters must be supplied; they
are not formally estimated (although the choice of ntop
may be influenced by the data — see below). The parameter
zeta
has a default value, FALSE
.
If the generating mechanism from which the observed data x
arose has a (known) theoretical least upper bound, then ntop
should probably be set equal to this upper bound. If the data
are theoretically unbounded, then ntop
should probably
be set equal to 1 + max(x)
. In this case \Pr(X =
\textrm{ntop})
should probably be interpreted
as \Pr(X \geq \textrm{ntop})
. Otherwise
ntop
should should probably be set equal to max(x)
.
The choice depends on circumstances and is up to the user.
Missing values are removed from x
before it is passed to
optim()
. (Note that ddb()
doesn't mind missing
values but returns missing values when evaluated at them.
This in turn produces a missing value for the log likelihood.)
In previous versions of this package (0.1-17 and earlier)
optim()
was called with method "L-BFGS-B"
.
The change was made possible by the fact that, with the new
“direct” version of ddb()
, it is no longer
necessary to bound the parameters away from (above) zero.
Value
An object of class "mleDb"
. Such an object consists of a
named vector with entries "alpha"
and "beta"
, which
are the estimates of the corresponding parameters. It has a
number of attributes:
-
"ntop"
The value of thentop
argument. -
"zeta"
The value of thezeta
argument. -
"log.like"
The (maximised) value of the log likelihood of the data. -
"covMat"
An estimate of the (2 \times 2
) covariance matrix of the parameter estimates. This is formed as the inverse of the hessian (of the negative log likelihood) calculated byaHess()
. -
ndata
The number of non-missing values in the data set for which the likelihood was maximised, i.e.sum(!is.na(x))
.
Author(s)
Rolf Turner r.turner@auckland.ac.nz
See Also
ddb
meDb()
optim()
aHess()
vcov.mleDb()
Examples
set.seed(42)
x <- rdb(500,3,5,2)
ests <- mleDb(x,2) # Bad! Mind you, 2 is a "bad" value for ntop!
# Hessian is singular; covMat is NA.
# Get much better results using true parameter values
# as starting values; pity we can't do this in real life!
ests <- mleDb(x,2,par0=c(alpha=3,beta=5))
x <- rdb(500,3,5,20)
ests <- mleDb(x,20) # Pretty good.
print(vcov(ests))
# Binomial, n = 10, p = 0.3.
set.seed(42)
x <- rbinom(1000,10,0.3)
fit <- mleDb(x,10,zeta=TRUE)
print(vcov(fit))
p1 <- dbinom(0:10,10,0.3)
p2 <- dbinom(0:10,10,mean(x)/10)
p3 <- table(factor(x,levels=0:10))/1000
plot(fit,obsd=x,legPos=NULL,ylim=c(0,max(p1,p2,p3,
ddb(0:10,fit[1],fit[2],10,zeta=TRUE))))
lines(0.2+(0:10),p1,col="orange",type="h",ylim=c(0,max(p1,p2)))
lines(0.3+(0:10),p2,col="green",type="h")
legend("topright",lty=1,col=c("red","blue","orange","green"),
legend=c("db","observed","true binomial","fitted binomial"),bty="n")
print(attr(fit,"log.like")) # -1778.36
print(sum(dbinom(x,10,mean(x)/10,log=TRUE))) # -1777.36
# Slightly better fit with only one estimated parameter,
# but then binomial is the true distribution, so you'd
# kind of expect a better fit.
print(sum(dbinom(x,10,0.3,log=TRUE))) # -1778.37
# Poisson mean = 5
set.seed(42)
x <- rpois(1000,5)
fit <- mleDb(x,14,zeta=TRUE) # max(x) = 13, take ntop = 1+13
print(vcov(fit))
p1 <- c(dpois(0:13,5),1-ppois(13,5))
lhat <- mean(x)
p2 <- c(dpois(0:13,lhat),1-ppois(13,lhat))
plot(fit,obsd=x,legPos=NULL,ylim=c(0,max(p1,p2,p3,
ddb(0:14,fit[1],fit[2],14,zeta=TRUE))))
lines(0.2+0:14,p1,col="orange",type="h")
lines(0.3+(0:14),p2,col="green",type="h")
legend("topright",lty=1,col=c("red","blue","orange","green"),
legend=c("db","observed","true Poisson","fitted Poisson"),bty="n")
print(attr(fit,"log.like")) # -2198.594
print(sum(dpois(x,lhat,log=TRUE))) # -2197.345
# Slightly better fit with only one estimated parameter,
# but then Poisson is the true distribution, so you'd
# kind of expect a better fit.
print(sum(dpois(x,5,log=TRUE))) # -2198.089