tolfind {npmlreg} | R Documentation |
Grid search over tol for NPML estimation of (generalized) random effect models
Description
Performs a grid search to select the parameter tol
,
which is a tuning parameter for starting point selection of the EM algorithm
for NPML estimation (see e.g. Aitkin, Hinde & Francis, 2009, p. 437)
Usage
tolfind(formula,
random = ~1,
family = gaussian(),
data,
k = 4,
random.distribution="np",
offset,
weights,
na.action,
EMmaxit = 500,
EMdev.change = 0.001,
lambda = 0,
damp = TRUE,
damp.power = 1,
spike.protect = 1,
sdev,
shape,
plot.opt = 1,
steps = 15,
find.in.range = c(0.05, 0.8),
verbose = FALSE,
noformat = FALSE,
...)
Arguments
formula |
a formula defining the response and the fixed effects (e.g. |
random |
a formula defining the random model. Set |
family |
conditional distribution of responses: "gaussian", "poisson", "binomial", "Gamma", or "inverse.gaussian" can be set. |
data |
the data frame (mandatory, even if it is attached to the workspace!). |
k |
the number of mass points/integration points (supported are up to 600 mass points). |
random.distribution |
the mixing distribution, Gaussian Quadrature
( |
offset |
an optional offset to be included in the model. |
weights |
optional prior weights for the data. |
na.action |
a function indicating what should happen when |
EMmaxit |
maximum number of EM iterations. |
EMdev.change |
stops EM algorithm when deviance change falls below this value. |
lambda |
see the help file for |
damp |
switches EM damping on or off. |
damp.power |
steers degree of damping. |
spike.protect |
see the help file for |
sdev |
optional fixed standard deviation for normal mixture. |
shape |
optional fixed shape parameter for Gamma and IG mixtures. |
plot.opt |
For |
steps |
number of grid points for the search of |
find.in.range |
range for the search of |
verbose |
If set to |
noformat |
If |
... |
further arguments which will be ignored. |
Details
The EM algorithm for NPML estimation (Aitkin, 1996) uses the Gauss-Hermite masses
and mass points as starting points. The position of the starting points can be
concentrated or extended by setting tol
smaller or larger than 1,
respectively. The tuning parameter tol
is, as in GLIM4, responsible for this scaling.
A careful selection of tol
may be necessary for some data sets.
The reason is that NPML has a tendency to get stuck in local maxima,
as the log-likelihhod function is not concave for fixed k
(Boehning, 1999).
For Gaussian, Gamma, and IG mixtures this R implementation uses by default a damping
procedure in the first cycles of the EM algorithm (Einbeck & Hinde, 2006),
which stabilizes the algorithm and makes it less sensitive to the optimal choice
of tol
. Application of tolfind
to check that the optimal solution has
not been overlooked may nevertheless be advisable.
tolfind
works for alldist
and allvc
. The tolfind
function
is mainly designed for NPML (random.distribution="np"
). It
can also be applied to Gaussian Quadrature (random.distribution="gq"
),
though tol
is of little importance for this and primarily influences
the speed of convergence.
Value
A list of 5 items:
MinDisparity |
the minimal disparity achieved (for which EM converged). |
Mintol |
the |
AllDisparities |
a vector containing all disparities calculated on the grid. |
Alltol |
all corresponding |
AllEMconverged |
a vector of Booleans indicating
if EM converged for the particular |
Author(s)
Jochen Einbeck & John Hinde (2006).
References
Aitkin, M. (1996). A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6 , 251-262.
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Boehning, D. (1999). Computer-Assisted Analysis of Mixtures and Applications. Meta-Analysis, Disease Mapping and others. Chapman & Hall / CRC, Boca Raton, FL, USA.
Einbeck, J. & Hinde, J. (2006). A note on NPML estimation for exponential family regression models with unspecified dispersion parameter. Austrian Journal of Statistics 35, 233-243.
See Also
Examples
data(galaxies, package="MASS")
gal<-as.data.frame(galaxies)
tolfind(galaxies/1000~1, random=~1, k=5, data=gal, lambda=1, damp=TRUE,
find.in.range=c(0,1), steps=10)
# Minimal Disparity: 380.1444 at tol= 0.5