cnm {nspmix} | R Documentation |
Maximum Likelihood Estimation of a Nonparametric Mixture Model
Description
Function cnm can be used to compute the maximum likelihood estimate
of a nonparametric mixing distribution (NPMLE) that has a one-dimensional
mixing parameter, or simply the mixing proportions with support points held
fixed.
Usage
cnm(
x,
init = NULL,
model = c("npmle", "proportions"),
maxit = 100,
tol = 1e-06,
grid = 100,
plot = c("null", "gradient", "probability"),
verbose = 0
)
Arguments
x |
a data object of some class that is fully defined by the user. The user needs to supply certain functions as described below. |
init |
list of user-provided initial values for the mixing distribution (mix) and the structural parameter (beta). |
model |
the type of model that is to be estimated: the non-parametric MLE
(if "npmle"), or the mixing proportions only, with support points held fixed
(if "proportions"). |
maxit |
maximum number of iterations. |
tol |
a tolerance value used to terminate the algorithm. Specifically,
the algorithm is terminated if the increase of the log-likelihood value
after an iteration is less than tol. |
grid |
number of grid points that are used by the algorithm to locate
all the local maxima of the gradient function. A larger number increases the
chance of locating all local maxima, at the expense of an increased
computational cost. The locations of the grid points are determined by the
function gridpoints (see Details). |
plot |
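whether a plot is produced at each iteration. Useful for
monitoring the convergence of the algorithm. If "null", no plot is produced.
If "gradient", the gradient function is plotted. If "probability", the plot
function defined by the user for the class of x is used (see Details). |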
verbose |
verbosity level for printing intermediate results in each iteration, including none (= 0), the log-likelihood value (= 1), the maximum gradient (= 2), the support points of the mixing distribution (= 3), the mixing proportions (= 4), and if available, the value of the structural parameter beta (= 5). |
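To illustrate what the grid argument controls, the following base-R sketch evaluates a gradient function over a grid and picks out its local maxima. It assumes the usual definition from the NPMLE literature, d(\theta; G) = \sum_i f(x_i; \theta) / f(x_i; G) - n, a Poisson family, and an equally spaced grid. The helper grad_fun is hypothetical and not part of nspmix; inside cnm the grid locations come from the family's own gridpoints method.

```r
# Gradient function d(theta; G) = sum_i f(x_i; theta) / f(x_i; G) - n
# for a Poisson mixture. `grad_fun` is a hypothetical helper, not nspmix code.
grad_fun <- function(theta, x, pt, pr) {
  fG <- drop(outer(x, pt, dpois) %*% pr)   # f(x_i; G) for each observation
  # Each row i of the n-by-m matrix is divided by fG[i] (column-wise recycling)
  colSums(outer(x, theta, dpois) / fG) - length(x)
}

set.seed(1)
# Two-component Poisson mixture simulated with base R only
x <- rpois(500, sample(c(1, 4), 500, replace = TRUE, prob = c(0.7, 0.3)))

# Start from a one-point G at the sample mean and scan an equally spaced grid
# (the equal spacing is an assumption made for this sketch)
grid <- seq(min(x), max(x), length.out = 100)
g <- grad_fun(grid, x, pt = mean(x), pr = 1)

# Interior grid points whose gradient exceeds that of both neighbours;
# maxima sitting exactly on the grid boundary are not detected here
i <- which(diff(sign(diff(g))) == -2) + 1
grid[i]   # approximate locations of local maxima
max(g)    # a positive value indicates that G can still be improved
```

A finer grid makes it less likely that a narrow local maximum falls between two grid points, which is the trade-off described for the grid argument.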
Details
A finite mixture model has a density of the form
f(x; \pi, \theta, \beta) = \sum_{j=1}^k \pi_j f(x; \theta_j, \beta),
where \pi_j \ge 0 and \sum_{j=1}^k \pi_j = 1.
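As a concrete instance of the finite mixture density, the sketch below evaluates it for a Poisson family using base R only. The helper name dmixpois is hypothetical and not part of nspmix, and the Poisson family has no structural parameter \beta, so that argument is omitted.

```r
# Density of a finite Poisson mixture: sum_j pr_j * f(x; theta_j).
# `dmixpois` is a hypothetical helper, not an nspmix function.
dmixpois <- function(x, theta, pr) {
  stopifnot(length(theta) == length(pr), all(pr >= 0),
            isTRUE(all.equal(sum(pr), 1)))
  # length(x) by length(theta) matrix of component densities
  comp <- outer(x, theta, dpois)
  drop(comp %*% pr)
}

theta <- c(1, 4)      # support points theta_j
pr    <- c(0.7, 0.3)  # mixing proportions pi_j
dmixpois(0:3, theta, pr)
```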
A nonparametric mixture model has a density of the form
f(x; G) = \int f(x; \theta) d G(\theta),
where G is a mixing distribution that is completely unspecified. The maximum
likelihood estimate of the nonparametric G, or the NPMLE of G, is known to be
a discrete distribution function.
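Because the NPMLE is discrete, the integral reduces to a finite sum over the support points of G, so the log-likelihood of any candidate discrete G can be evaluated directly. The following is a minimal base-R sketch for a Poisson family; loglik_disc is a hypothetical helper and does not reflect how cnm computes the likelihood internally.

```r
# Log-likelihood of a discrete mixing distribution G for Poisson data.
# `loglik_disc` is a hypothetical helper, not an nspmix function.
loglik_disc <- function(x, pt, pr) {
  dens <- outer(x, pt, dpois) %*% pr   # f(x_i; G) = sum_j pr_j f(x_i; pt_j)
  sum(log(dens))
}

set.seed(1)
# Simulate from a two-component Poisson mixture using base R only
z <- sample(c(1, 4), size = 500, replace = TRUE, prob = c(0.7, 0.3))
x <- rpois(500, z)

loglik_disc(x, pt = c(1, 4), pr = c(0.7, 0.3))  # G with two support points
loglik_disc(x, pt = mean(x), pr = 1)            # G with a single support point
```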
Function cnm implements the CNM algorithm that is proposed in Wang
(2007) and the hierarchical CNM algorithm of Wang and Taylor (2013). The
implementation is generic using S3 object-oriented programming, in the sense
that it works for an arbitrary family of mixture models defined by the user.
The user, however, needs to supply the implementations of the following
functions for their self-defined family of mixture models, as they are
needed internally by function cnm:
initial(x, beta, mix, kmax)
valid(x, beta)
logd(x, beta, pt, which)
gridpoints(x, beta, grid)
suppspace(x, beta)
length(x)
print(x, ...)
weight(x, ...)
While not needed by the algorithm for finding the solution, one may also implement
plot(x, mix, beta, ...)
so that the fitted model can be shown graphically in a user-defined way.
Inside cnm, it is used when plot="probability" so that the
convergence of the algorithm can be graphically monitored.
For creating a new class, the user may consult the implementations of these
functions for the families of mixture models included in the package, e.g.,
npnorm and nppois.
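The S3 mechanism that this design relies on can be sketched with a toy class. Everything below is hypothetical: the class name mypois, the local stand-in generic logd, and the method bodies. The return value layouts that cnm expects from each function are not documented on this page, so the bodies are placeholders, and the package implementations for npnorm and nppois remain the reference.

```r
# Toy S3 class; "mypois" and all method bodies are hypothetical.
# Data are stored in component `v`, as accessed via x$v in the Examples.
mypois <- function(v) structure(list(v = v), class = "mypois")

# Local stand-in generic so that the sketch runs without nspmix loaded
logd <- function(x, ...) UseMethod("logd")

# length() is already generic in base R, so only a method is needed
length.mypois <- function(x) length(x$v)

# Log-density of each observation at each support point in `pt`
# (this return layout is an assumption, not the package's contract)
logd.mypois <- function(x, beta = NULL, pt, which = NULL, ...) {
  outer(x$v, pt, dpois, log = TRUE)
}

x <- mypois(c(0, 2, 5))
length(x)                   # dispatches to length.mypois
dim(logd(x, pt = c(1, 4)))  # dispatches to logd.mypois
```

Dispatching on the class of x is what lets a single fitting routine serve any family for which these methods exist.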
Value
family |
the name of the mixture family that is used to fit to the data. |
num.iterations |
number of iterations required by the algorithm |
max.gradient |
maximum value of the gradient function, evaluated at the beginning of the final iteration |
convergence |
convergence code. |
ll |
log-likelihood value at convergence |
mix |
MLE of the mixing distribution, being an object of the class disc. |
beta |
value of the structural parameter, which is held fixed throughout the computation. |
Author(s)
Yong Wang <yongwang@auckland.ac.nz>
References
Wang, Y. (2007). On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. Journal of the Royal Statistical Society, Ser. B, 69, 185-198.
Wang, Y. (2010). Maximum likelihood computation for fitting semiparametric mixture models. Statistics and Computing, 20, 75-86.
Wang, Y. and Taylor, S. M. (2013). Efficient computation of nonparametric survival functions via a hierarchical mixture formulation. Statistics and Computing, 23, 713-725.
See Also
Examples
library(nspmix)

## Simulated data
x = rnppois(1000, disc(c(1,4), c(0.7,0.3))) # Poisson mixture
(r = cnm(x))
plot(r, x)
x = rnpnorm(1000, disc(c(0,4), c(0.3,0.7)), sd=1) # Normal mixture
plot(cnm(x), x) # sd = 1
plot(cnm(x, init=list(beta=0.5)), x) # sd = 0.5
mix0 = disc(seq(min(x$v),max(x$v), len=100)) # over a finite grid
plot(cnm(x, init=list(beta=0.5, mix=mix0), model="p"),
     x, add=TRUE, col="blue") # An approximate NPMLE
## Real-world data
data(thai)
plot(cnm(x <- nppois(thai)), x) # Poisson mixture
data(brca)
plot(cnm(x <- npnorm(brca)), x) # Normal mixture