mable.group {mable} | R Documentation |
Mable fit of one-sample grouped data by an optimal or a preselected model degree
Description
Maximum approximate Bernstein/Beta likelihood estimation based on
one-sample grouped data with an optimal selected by the change-point method among m0:m1
or a preselected model degree m
.
Usage
mable.group(
x,
breaks,
M,
interval = c(0, 1),
IC = c("none", "aic", "hqic", "all"),
vb = 0,
controls = mable.ctrl(),
progress = TRUE
)
Arguments
x |
vector of frequencies |
breaks |
class interval end points |
M |
a positive integer or a vector |
interval |
a vector containing the endpoints of support/truncation interval |
IC |
information criterion(s) in addition to Bayesian information criterion (BIC). Current choices are "aic" (Akaike information criterion) and/or "qhic" (Hannan–Quinn information criterion). |
vb |
code for vanishing boundary constraints, -1: f0(a)=0 only, 1: f0(b)=0 only, 2: both, 0: none (default). |
controls |
Object of class |
progress |
if TRUE a text progressbar is displayed |
Details
Any continuous density function f
on a known closed supporting interval [a, b]
can be
estimated by Bernstein polynomial f_m(x; p) = \sum_{i=0}^m p_i\beta_{mi}[(x-a)/(b-a)]/(b-a)
,
where p = (p_0, \ldots, p_m)
, p_i\ge 0
, \sum_{i=0}^m p_i=1
and
\beta_{mi}(u) = (m+1){m\choose i}u^i(1-x)^{m-i}
, i = 0, 1, \ldots, m
,
is the beta density with shapes (i+1, m-i+1)
.
For each m
, the MABLE of the coefficients p
, the mixture proportions, are
obtained using EM algorithm. The EM iteration for each candidate m
stops if either
the total absolute change of the log likelihood and the coefficients of Bernstein polynomial
is smaller than eps
or the maximum number of iterations maxit
is reached.
If m0<m1
, an optimal model degree is selected as the change-point of the increments of
log-likelihood, log likelihood ratios, for m \in \{m_0, m_0+1, \ldots, m_1\}
. Alternatively,
one can choose an optimal degree based on the BIC (Schwarz, 1978) which are evaluated at
m \in \{m_0, m_0+1, \ldots, m_1\}
. The search for optimal degree m
is stoped if either
m1
is reached with a warning or the test for change-point results in a p-value pval
smaller than sig.level
. The BIC for a given degree m
is calculated as in
Schwarz (1978) where the dimension of the model is d=\#\{i: \hat p_i \ge \epsilon,
i = 0, \ldots, m\} - 1
and a default \epsilon
is chosen as .Machine$double.eps
.
Value
A list with components
-
m
the given or a selected degree by method of change-point -
p
the estimatedp
with degreem
-
mloglik
the maximum log-likelihood at degreem
-
interval
supporting interval(a, b)
-
convergence
An integer code. 0 indicates successful completion (all the EM iterations are convergent and an optimal degree is successfully selected inM
). Possible error codes are1, indicates that the iteration limit
maxit
had been reached in at least one EM iteration;2, the search did not finish before
m1
.
-
delta
the convergence criteriondelta
value
and, if m0<m1
,
-
M
the vector(m0, m1)
, wherem1
, if greater thanm0
, is the largest candidate when the search stoped -
lk
log-likelihoods evaluated atm \in \{m_0, \ldots, m_1\}
-
lr
likelihood ratios for change-points evaluated atm \in \{m_0+1, \ldots, m_1\}
-
ic
a list containing the selected information criterion(s) -
pval
the p-values of the change-point tests for choosing optimal model degree -
chpts
the change-points chosen with the given candidate model degrees
Author(s)
Zhong Guan <zguan@iusb.edu>
References
Guan, Z. (2017) Bernstein polynomial model for grouped continuous data. Journal of Nonparametric Statistics, 29(4):831-848.
See Also
Examples
## Chicken Embryo Data
data(chicken.embryo)
a<-0; b<-21
day<-chicken.embryo$day
nT<-chicken.embryo$nT
Day<-rep(day,nT)
res<-mable.group(x=nT, breaks=a:b, M=c(2,100), interval=c(a, b), IC="aic",
controls=mable.ctrl(sig.level=1e-6, maxit=2000, eps=1.0e-7))
op<-par(mfrow=c(1,2), lwd=2)
layout(rbind(c(1, 2), c(3, 3)))
plot(res, which="likelihood")
plot(res, which="change-point")
fk<-density(x=rep((0:20)+.5, nT), bw="sj", n=101, from=a, to=b)
hist(Day, breaks=seq(a,b, length=12), freq=FALSE, col="grey",
border="white", main="Histogram and Density Estimates")
plot(res, which="density",types=1:2, colors=1:2)
lines(fk, lty=2, col=2)
legend("topright", lty=c(1:2), c("MABLE", "Kernel"), bty="n", col=c(1:2))
par(op)