R: Estimate Parameters of Mixture Distributions

mix {mixdist}

R Documentation

Estimate Parameters of Mixture Distributions

Description

Find a set of overlapping component distributions that gives the best fit to grouped data and conditional data, using a combination of a Newton-type method and the EM algorithm.
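As a rough illustration of what fitting to grouped data involves, the base-R sketch below computes the probability that a two-component normal mixture assigns to each grouping interval, and the expected counts these probabilities imply. The boundaries, the parameter values, and the helper name `interval_probs` are hypothetical, and this is not the code `mix` uses internally.

```r
# Hypothetical helper: probability mass a normal mixture puts on each interval.
# `right` holds the right boundaries; the last one is Inf (open-ended interval).
interval_probs <- function(right, pi, mu, sigma) {
  # Mixture CDF evaluated at every right boundary
  cdf <- vapply(right,
                function(x) sum(pi * pnorm(x, mean = mu, sd = sigma)),
                numeric(1))
  # Successive differences of the CDF give the interval probabilities;
  # the first interval starts at -Inf, where the CDF is 0.
  diff(c(0, cdf))
}

right <- c(20, 30, 40, 50, Inf)   # illustrative boundaries
p <- interval_probs(right, pi = c(0.6, 0.4), mu = c(25, 45), sigma = c(5, 6))
expected <- 200 * p               # expected counts for 200 observations
round(expected, 1)
```

A goodness-of-fit comparison would then set such expected counts against the observed frequencies in each interval.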

Usage

mix(mixdat, mixpar, dist = "norm", constr = list(conpi = "NONE", 
    conmu = "NONE", consigma = "NONE", fixpi = NULL, fixmu = NULL, 
    fixsigma = NULL, cov = NULL, size = NULL), emsteps = 1, 
    usecondit = FALSE, exptol = 5e-06, print.level = 0, ...)

Arguments

`mixdat`	A data frame containing grouped data. The first column gives the right boundaries of the grouping intervals, where the first and last intervals are open-ended; the second column gives the frequencies, i.e. the number of observations falling into each interval. If conditional data are available, the data frame should have k + 2 columns, where k is the number of components; the element in row j and column i + 2 is the number of observations from the jth interval belonging to the ith component.
`mixpar`	A data frame containing starting values for parameters of component distributions, which are, in order, the proportions, means, and standard deviations.
`dist`	the distribution of the components; one of `"norm"`, `"lnorm"`, `"gamma"`, `"weibull"`, `"binom"`, `"nbinom"`, or `"pois"`.
`constr`	a list of constraints on parameters of component distributions. See function `mixconstr`.
`emsteps`	a non-negative integer specifying the number of EM steps to be performed.
`usecondit`	logical. If `usecondit` is `TRUE` and `mixdat` includes conditional data, then conditional data will be used with grouped data to estimate parameters of mixtures.
`exptol`	a positive scalar giving the tolerance at which the scaled fitted value is considered large enough to be a degree of freedom.
`print.level`	the level of printing during the optimization process. The default value of `0` means that no printing occurs, `1` means that initial and final details are printed, and `2` means that full tracing information is printed.
`...`	additional arguments passed to the optimization function `nlm`.
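The layout described for `mixdat` and `mixpar` can be sketched in base R. The raw values, break points, and column names below are illustrative assumptions, not requirements stated on this page. The mixdist package also provides its own helpers for building these objects (`mixgroup`, `mixparam`); the sketch avoids them only to stay self-contained.

```r
set.seed(1)
# Illustrative raw data drawn from a two-component normal mixture
x <- c(rnorm(120, mean = 25, sd = 4), rnorm(80, mean = 45, sd = 5))

# Right boundaries of the grouping intervals; Inf marks the open-ended last interval
right <- c(20, 25, 30, 35, 40, 45, 50, Inf)

# Count observations in (-Inf, 20], (20, 25], ..., (50, Inf]
counts <- as.vector(table(cut(x, breaks = c(-Inf, right))))

# Grouped data: column 1 = right boundaries, column 2 = frequencies
grouped <- data.frame(X = right, count = counts)

# Starting values in the documented order: proportions, means, standard deviations
start <- data.frame(pi = c(0.5, 0.5), mu = c(22, 48), sigma = c(5, 5))

grouped
start
```

Data frames shaped like `grouped` and `start` correspond to the first two arguments of `mix`; conditional data, if present, would add one further column per component.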

Value

A list containing the following items:

`parameters`	A data frame containing estimated values for parameters of component distributions, which are, in order, the proportions, means, and standard deviations.
`se`	A data frame containing estimated values for standard errors of parameters of component distributions.
`distribution`	the distribution used to fit the data.
`constraint`	the constraints on parameters.
`chisq`	the goodness-of-fit chi-square statistic.
`df`	degrees of freedom of the fitted mixture model.
`P`	a significance level (P-value) for the goodness-of-fit test.
`vmat`	covariance matrix for the estimated parameters.
`mixdata`	the original data, i.e. the argument `mixdat`.
`usecondit`	the value of the argument `usecondit`.
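The relationship between `chisq`, `df`, and `P` can be illustrated in base R, assuming `P` is the upper-tail probability of a chi-square distribution with `df` degrees of freedom, as is usual for goodness-of-fit tests. The numbers below are made up for illustration and are not output from `mix`.

```r
# Illustrative values only; a real fit would supply these as fit$chisq and fit$df
chisq <- 12.3
df <- 7

# Upper-tail probability: the chance of a statistic at least this large
# if the fitted mixture were the true model
P <- pchisq(chisq, df = df, lower.tail = FALSE)
P
```

A small `P` would suggest that the fitted mixture does not describe the grouped data well.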

References

Macdonald, P.D.M. and Green, P.E.J. (1988) User's Guide to Program MIX: An Interactive Program for Fitting Mixtures of Distributions. ICHTHUS DATA SYSTEMS.

Examples

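# Grouped data and starting parameter values
data(pike65)
data(pikepar)
# Lognormal components with a constant coefficient of variation ("CCV")
fitpike1 <- mix(pike65, pikepar, "lnorm", constr = mixconstr(consigma = "CCV"), emsteps = 3)
fitpike1
plot(fitpike1)
# Grouped data that also include conditional data, used via usecondit = TRUE
data(pike65sg)
fitpike2 <- mix(pike65sg, pikepar, "lnorm", emsteps = 3, usecondit = TRUE)
fitpike2
plot(fitpike2)
# Binomial mixture with four components, each of size 20
data(bindat)
data(binpar)
fitbin1 <- mix(bindat, binpar, "binom",
               constr = mixconstr(consigma = "BINOM", size = c(20, 20, 20, 20)))
plot(fitbin1)
# Same model with all four proportions held fixed at their starting values
fitbin2 <- mix(bindat, binpar, "binom", constr = mixconstr(conpi = "PFX",
               fixpi = c(TRUE, TRUE, TRUE, TRUE),
               consigma = "BINOM", size = c(20, 20, 20, 20)))
plot(fitbin2)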

[Package mixdist version 0.5-5 Index]