negLLsquash {openEBGM} | R Documentation
Likelihood with data squashing and no zero counts
Description
negLLsquash computes the negative log-likelihood based on the conditional marginal distribution of the counts, N, given that N >= N*, where N* is the smallest count used for estimating the hyperparameters. This function is minimized to estimate the hyperparameters of the prior distribution. Use this function when zero counts are not used and data squashing is used as described by DuMouchel and Pregibon (2001). This is the likelihood function that should usually be chosen.
Usage
negLLsquash(theta, ni, ei, wi, N_star = 1)
Arguments
theta | A numeric vector of hyperparameters ordered as: |
ni | A whole number vector of squashed actual counts from |
ei | A numeric vector of squashed expected counts from |
wi | A whole number vector of bin weights from |
N_star | A scalar whole number for the minimum count size used. |
Details
The conditional marginal distribution for the counts, N, given that N >= N*, is based on a mixture of two negative binomial distributions. The hyperparameters for the prior distribution (mixture of gammas) are estimated by optimizing the likelihood equation from this conditional marginal distribution. It is recommended to use N_star = 1 when practical.
The hyperparameters are:
\alpha_1, \beta_1: Parameters of the first component of the marginal distribution of the counts (also the prior distribution)
\alpha_2, \beta_2: Parameters of the second component
P: Mixture fraction
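As a reading aid, the objective described above can be written out as follows. This is a sketch derived from the description on this page, not taken from the package source: the symbols f, m, and \ell are introduced here for illustration, n_i, E_i, and w_i stand for the entries of ni, ei, and wi, and the use of the bin weights as multipliers on the log-likelihood terms is an assumption based on their role in data squashing. The package may apply the truncation at N* in a slightly different form (for example, to each mixture component separately).

```latex
% Negative binomial marginal from a gamma(\alpha, \beta) prior, expected count E
f(n; \alpha, \beta, E) =
  \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)\, n!}
  \left(1 + \frac{\beta}{E}\right)^{-n}
  \left(1 + \frac{E}{\beta}\right)^{-\alpha}

% Two-component mixture with mixture fraction P
m(n; \theta, E) = P\, f(n; \alpha_1, \beta_1, E)
                + (1 - P)\, f(n; \alpha_2, \beta_2, E)

% Weighted negative log-likelihood, conditional on N \ge N^*
-\ell(\theta) = -\sum_i w_i \,
  \log \frac{m(n_i; \theta, E_i)}
            {1 - \sum_{k=0}^{N^* - 1} m(k; \theta, E_i)}
```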
This function will not need to be called directly if using exploreHypers or autoHyper.
Value
A scalar negative log-likelihood value
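To illustrate how such a scalar might arise, here is a minimal base R sketch of a weighted, truncated negative binomial mixture likelihood. The function neg_ll_sketch and the toy vectors are hypothetical and are not part of openEBGM; the sketch truncates the mixture as a whole at N_star and weights each term by wi, both of which are assumptions based on the description on this page. Its result may therefore differ from what negLLsquash returns.

```r
# Hypothetical helper (not part of openEBGM): weighted negative log-likelihood
# of a two-component negative binomial mixture, conditional on N >= N_star.
neg_ll_sketch <- function(theta, ni, ei, wi, N_star = 1) {
  a1 <- theta[1]; b1 <- theta[2]  # first component
  a2 <- theta[3]; b2 <- theta[4]  # second component
  P  <- theta[5]                  # mixture fraction

  # A gamma(alpha, beta) prior on a Poisson rate with expected count E gives a
  # negative binomial marginal with size = alpha, prob = beta / (beta + E).
  mix_pmf <- P * stats::dnbinom(ni, size = a1, prob = b1 / (b1 + ei)) +
    (1 - P) * stats::dnbinom(ni, size = a2, prob = b2 / (b2 + ei))

  # Probability mass below N_star under the same mixture.
  mix_cdf <- P * stats::pnbinom(N_star - 1, size = a1, prob = b1 / (b1 + ei)) +
    (1 - P) * stats::pnbinom(N_star - 1, size = a2, prob = b2 / (b2 + ei))

  # Assumption: each squashed point contributes its term wi times.
  -sum(wi * log(mix_pmf / (1 - mix_cdf)))
}

# Toy inputs (made-up values for illustration only).
ni <- c(1, 2, 3, 5)
ei <- c(0.5, 1.2, 2.0, 1.1)
wi <- c(10, 4, 1, 1)
val <- neg_ll_sketch(c(1, 1, 3, 3, 0.2), ni, ei, wi)
```

Because the function returns a single number for a given theta, it can be passed as the objective to an optimizer in the same way as in the Examples section.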
Warnings
Make sure N_star matches the smallest actual count in ni before using this function. Filter ni, ei, and wi if needed.
Make sure the data were actually squashed (see squashData) before using this function.
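The filtering mentioned in the first warning might look like the sketch below. The data frame and the choice N_star = 2 are made-up values for illustration; the column names N, E, and weight follow the Examples section.

```r
# Toy squashed data (made-up values for illustration only).
squashed <- data.frame(N = c(1, 1, 2, 3, 5),
                       E = c(0.4, 1.1, 0.9, 2.3, 1.7),
                       weight = c(120, 35, 8, 1, 1))

N_star <- 2  # hypothetical minimum count

# Keep only the rows whose counts are at or above N_star, filtering all
# three vectors with the same index so they stay aligned.
keep <- squashed$N >= N_star
ni <- squashed$N[keep]
ei <- squashed$E[keep]
wi <- squashed$weight[keep]

# The smallest remaining count should now equal N_star.
stopifnot(min(ni) == N_star)
```

The filtered vectors can then be passed as ni, ei, and wi, together with the matching N_star value.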
References
DuMouchel W, Pregibon D (2001). "Empirical Bayes Screening for Multi-item Associations." In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '01, pp. 67-76. ACM, New York, NY, USA. ISBN 1-58113-391-X.
See Also
nlm, nlminb, and optim for optimization and squashData for data squashing
Other negative log-likelihood functions: negLLzeroSquash(), negLLzero(), negLL()
Examples
data.table::setDTthreads(2) #only needed for CRAN checks
theta_init <- c(1, 1, 3, 3, .2) #initial guess
data(caers)
proc <- processRaw(caers)
squashed <- squashData(proc, bin_size = 300, keep_pts = 10)
squashed <- squashData(squashed, count = 2, bin_size = 13, keep_pts = 10)
negLLsquash(theta = theta_init, ni = squashed$N, ei = squashed$E,
            wi = squashed$weight)
#For hyperparameter estimation...
stats::nlminb(start = theta_init, objective = negLLsquash, ni = squashed$N,
              ei = squashed$E, wi = squashed$weight)