hyperEM {openEBGM} | R Documentation |
Estimate hyperparameters using an EM algorithm
Description
hyperEM
finds hyperparameter estimates using a variation on the
Expectation-Maximization (EM) algorithm known as the Expectation/Conditional
Maximization (ECM) algorithm (Meng et al, 1993). The algorithm estimates each
element of the hyperparameter vector, \theta
, while holding fixed
(conditioning on) the other parameters in the vector. Alternatively, it can
estimate both parameters for each distribution in the mixture while holding
the parameters from the other distribution and the mixing fraction fixed.
Usage
hyperEM(
data,
theta_init_vec,
squashed = TRUE,
zeroes = FALSE,
N_star = 1,
method = c("score", "nlminb"),
profile = c("parameter", "distribution"),
LL_tol = 1e-04,
consecutive = 100,
param_lower = 1e-05,
param_upper = 20,
print_level = 2,
max_iter = 5000,
conf_ints = FALSE,
conf_level = c("95", "80", "90", "99"),
track = FALSE
)
Arguments
data |
A data frame from |
theta_init_vec |
A numeric vector of initial hyperparameter guesses
ordered as: |
squashed |
A scalar logical ( |
zeroes |
A scalar logical specifying if zero counts are included. |
N_star |
A positive scalar whole number value for the minimum count
size to be used for hyperparameter estimation. If zeroes are used, set
|
method |
A scalar string indicating which method (i.e. score functions
or log-likelihood function) to use for the maximization steps. Possible
values are |
profile |
A scalar string indicating which method to use to optimize the
log-likelihood function if |
LL_tol |
A scalar numeric value for the tolerance used for determining when the change in log-likelihood at each iteration is sufficiently small. Used for convergence assessment. |
consecutive |
A positive scalar whole number value for the number of
consecutive iterations the change in log-likelihood must be below
|
param_lower |
A scalar numeric value for the smallest acceptable value
for each |
param_upper |
A scalar numeric value for the largest acceptable value
for each |
print_level |
A value that determines how much information is printed
during execution. Possible value are |
max_iter |
A positive scalar whole number value for the maximum number of iterations to use. |
conf_ints |
A scalar logical indicating if confidence intervals and standard errors should be returned. |
conf_level |
A scalar string for the confidence level used if confidence intervals are requested. |
track |
A scalar logical indicating whether or not to retain the hyperparameter estimates and log-likelihood value at each iteration. Can be used for plotting to better understand the algorithm's behavior. |
Details
If method = "score"
, the maximization step finds a root
of the score function. If method = "nlminb"
,
nlminb
is used to find a minimum of the negative
log-likelihood function.
If method = "score"
and zeroes = FALSE
, then
'N_star'
must equal 1
.
If method = "score"
, the model is reparameterized. The
parameters are transformed to force the parameter space to include all real
numbers. This approach addresses numerical issues at the edge of the
parameter space. The reparameterization follows:
\alpha_{prime} = log(\alpha)
, \beta_{prime} = log(\beta)
, and
P_{prime} = tan(pi * P - pi / 2)
. However, the values returned in
estimates
are on the original scale (back-transformed).
On every 100th iteration, the procedure described in Millar (2011)
is used to accelerate the estimate of \theta
.
The score vector and its Euclidean norm should be close to zero at a local maximum and can therefore be used to help assess the reliability of the results. A local maximum might not be the global MLE, however.
Asymptotic normal confidence intervals, if requested, use standard errors calculated from the observed Fisher information matrix as discussed in DuMouchel (1999).
Value
A list including the following:
estimates: The maximum likelihood estimate (MLE) of the hyperparameter,
\theta
.conf_int: A data frame including the standard errors and confidence limits for
estimates
. Only included ifconf_ints = TRUE
.maximum: The log-likelihood function evaluated at
estimates
.method: The method used in the maximization step.
elapsed: The elapsed function execution time in seconds.
iters: The number of iterations used.
score: The score functions (i.e. score vector) evaluated at
estimates
. All elements should be close to zero.score_norm: The Euclidean norm of the score vector evaluated at
estimates
. Should be close to zero.tracking: The estimates of
\theta
at each iteration and the log-likelihood function evaluated at those estimates. Unlesstrack = TRUE
, only shows the starting point of the algorithm.
References
DuMouchel W (1999). "Bayesian Data Mining in Large Frequency Tables, With an Application to the FDA Spontaneous Reporting System." The American Statistician, 53(3), 177-190.
Meng X-L, Rubin D (1993). "Maximum likelihood estimation via the ECM algorithm: A general framework", Biometrika, 80(2), 267-278.
Millar, Russell B (2011). "Maximum Likelihood Estimation and Inference", John Wiley & Sons, Ltd, 111-112.
See Also
uniroot
for finding a zero of the score
function and nlminb
for minimizing the negative
log-likelihood function
Other hyperparameter estimation functions:
autoHyper()
,
exploreHypers()
Examples
data.table::setDTthreads(2) #only needed for CRAN checks
data(caers)
proc <- processRaw(caers)
squashed <- squashData(proc, bin_size = 300, keep_pts = 10)
squashed <- squashData(squashed, count = 2, bin_size = 13, keep_pts = 10)
hypers <- hyperEM(squashed, theta_init_vec = c(1, 1, 2, 2, .3),
consecutive = 10, LL_tol = 1e-03)