| mleDb {dbd} | R Documentation |
Maximum likelihood estimates of db parameters.
Description
Calculates maximum likelihood estimates of the alpha and
beta parameters of a db distribution. Calls upon
optim() with the "BFGS" method.
Usage
mleDb(x, ntop, zeta=FALSE, par0=NULL, UB=10, maxit=1000,
covmat=TRUE, useGinv=FALSE)
Arguments
x |
A random sample from the db distribution whose parameters are being estimated. Missing values are allowed. |
ntop |
The |
zeta |
See |
par0 |
Optional starting values for the iterative estimation procedure.
A vector with entries |
UB |
Positive numeric scalar, providing an upper bound on the starting
values used by |
maxit |
Integer scalar. The maximum number of iterations to be undertaken
by |
covmat |
Logical scalar. Should the covariance matrix of the parameter
estimates be calculated? In simulation studies, in which the
covariance matrix is not of interest, calculations might be
speeded up a bit by setting |
useGinv |
Logical scalar. Should the |
Details
The ntop and zeta parameters must be supplied; they
are not formally estimated (although the choice of ntop
may be influenced by the data — see below). The parameter
zeta has a default value, FALSE.
If the generating mechanism from which the observed data x
arose has a (known) theoretical least upper bound, then ntop
should probably be set equal to this upper bound. If the data
are theoretically unbounded, then ntop should probably
be set equal to 1 + max(x). In this case \Pr(X =
\textrm{ntop}) should probably be interpreted
as \Pr(X \geq \textrm{ntop}). Otherwise
ntop should should probably be set equal to max(x).
The choice depends on circumstances and is up to the user.
Missing values are removed from x before it is passed to
optim(). (Note that ddb() doesn't mind missing
values but returns missing values when evaluated at them.
This in turn produces a missing value for the log likelihood.)
In previous versions of this package (0.1-17 and earlier)
optim() was called with method "L-BFGS-B".
The change was made possible by the fact that, with the new
“direct” version of ddb(), it is no longer
necessary to bound the parameters away from (above) zero.
Value
An object of class "mleDb". Such an object consists of a
named vector with entries "alpha" and "beta", which
are the estimates of the corresponding parameters. It has a
number of attributes:
-
"ntop"The value of thentopargument. -
"zeta"The value of thezetaargument. -
"log.like"The (maximised) value of the log likelihood of the data. -
"covMat"An estimate of the (2 \times 2) covariance matrix of the parameter estimates. This is formed as the inverse of the hessian (of the negative log likelihood) calculated byaHess(). -
ndataThe number of non-missing values in the data set for which the likelihood was maximised, i.e.sum(!is.na(x)).
Author(s)
Rolf Turner r.turner@auckland.ac.nz
See Also
ddb
meDb()
optim()
aHess()
vcov.mleDb()
Examples
set.seed(42)
x <- rdb(500,3,5,2)
ests <- mleDb(x,2) # Bad! Mind you, 2 is a "bad" value for ntop!
# Hessian is singular; covMat is NA.
# Get much better results using true parameter values
# as starting values; pity we can't do this in real life!
ests <- mleDb(x,2,par0=c(alpha=3,beta=5))
x <- rdb(500,3,5,20)
ests <- mleDb(x,20) # Pretty good.
print(vcov(ests))
# Binomial, n = 10, p = 0.3.
set.seed(42)
x <- rbinom(1000,10,0.3)
fit <- mleDb(x,10,zeta=TRUE)
print(vcov(fit))
p1 <- dbinom(0:10,10,0.3)
p2 <- dbinom(0:10,10,mean(x)/10)
p3 <- table(factor(x,levels=0:10))/1000
plot(fit,obsd=x,legPos=NULL,ylim=c(0,max(p1,p2,p3,
ddb(0:10,fit[1],fit[2],10,zeta=TRUE))))
lines(0.2+(0:10),p1,col="orange",type="h",ylim=c(0,max(p1,p2)))
lines(0.3+(0:10),p2,col="green",type="h")
legend("topright",lty=1,col=c("red","blue","orange","green"),
legend=c("db","observed","true binomial","fitted binomial"),bty="n")
print(attr(fit,"log.like")) # -1778.36
print(sum(dbinom(x,10,mean(x)/10,log=TRUE))) # -1777.36
# Slightly better fit with only one estimated parameter,
# but then binomial is the true distribution, so you'd
# kind of expect a better fit.
print(sum(dbinom(x,10,0.3,log=TRUE))) # -1778.37
# Poisson mean = 5
set.seed(42)
x <- rpois(1000,5)
fit <- mleDb(x,14,zeta=TRUE) # max(x) = 13, take ntop = 1+13
print(vcov(fit))
p1 <- c(dpois(0:13,5),1-ppois(13,5))
lhat <- mean(x)
p2 <- c(dpois(0:13,lhat),1-ppois(13,lhat))
plot(fit,obsd=x,legPos=NULL,ylim=c(0,max(p1,p2,p3,
ddb(0:14,fit[1],fit[2],14,zeta=TRUE))))
lines(0.2+0:14,p1,col="orange",type="h")
lines(0.3+(0:14),p2,col="green",type="h")
legend("topright",lty=1,col=c("red","blue","orange","green"),
legend=c("db","observed","true Poisson","fitted Poisson"),bty="n")
print(attr(fit,"log.like")) # -2198.594
print(sum(dpois(x,lhat,log=TRUE))) # -2197.345
# Slightly better fit with only one estimated parameter,
# but then Poisson is the true distribution, so you'd
# kind of expect a better fit.
print(sum(dpois(x,5,log=TRUE))) # -2198.089