fitmixture {ForestFit} | R Documentation
Estimating parameters of the well-known mixture models
Description
Estimates the parameters of a finite mixture model using the expectation-maximization (EM) algorithm. The general form of the cdf of a statistical mixture model is
F(x,\Theta) = \sum_{j=1}^{K}\omega_j F_j(x,\theta_j),
where \Theta=(\theta_1,\dots,\theta_K)^T is the whole parameter vector, \theta_j=(\alpha_j,\beta_j)^{T}, for j=1,\dots,K, is the parameter vector of the j-th component, F_j(.,\theta_j) is the cdf of the j-th component, and the known constant K is the number of components. The parameters \alpha and \beta are either the shape and scale parameters, or both are shape parameters. In the latter case, \alpha and \beta are called the first and second shape parameters, respectively. The weights \omega_j sum to one, i.e. \sum_{j=1}^{K}\omega_j=1. The families considered for the cdf F include Birnbaum-Saunders, Burr type XII, Chen, F, Frechet, Gamma, Gompertz, Log-normal, Log-logistic, Lomax, skew-normal, and Weibull.
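As an illustration of the mixture cdf above, the following minimal sketch evaluates F(x, \Theta) for Weibull components in base R. The helper name `mixture_cdf` and the parameter values are hypothetical and are not part of ForestFit; `alpha` is taken to be the shape and `beta` the scale.

```r
# Hypothetical helper (not part of ForestFit): cdf of a K-component Weibull mixture.
# omega: weights summing to one; alpha: shape parameters; beta: scale parameters.
mixture_cdf <- function(x, omega, alpha, beta) {
  stopifnot(length(omega) == length(alpha),
            length(alpha) == length(beta),
            abs(sum(omega) - 1) < 1e-8)
  # One column per component: F_j(x, theta_j)
  comp <- sapply(seq_along(omega),
                 function(j) pweibull(x, shape = alpha[j], scale = beta[j]))
  # sapply returns a plain vector when x has length one, so coerce to a matrix
  comp <- matrix(comp, nrow = length(x))
  # Weighted sum over the components
  as.vector(comp %*% omega)
}

# Illustrative (made-up) parameters for K = 3
omega <- c(0.5, 0.3, 0.2)
alpha <- c(1.5, 3.0, 5.0)
beta  <- c(5, 10, 20)
mixture_cdf(c(2, 8, 15), omega, alpha, beta)
```

Because every component cdf is nondecreasing and the weights are nonnegative and sum to one, the result is itself a valid cdf.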
Usage
fitmixture(data, family, K, initial=FALSE, starts)
Arguments
data |
Vector of observations. |
family |
Name of the family including: " |
K |
Number of components. |
initial |
The sequence of initial values including |
starts |
If |
Details
Identifiability of the mixture model is assumed to hold. For the skew-normal case, \theta_j=(\alpha_j,\beta_j,\lambda_j)^{T}, in which -\infty<\alpha_j<\infty, \beta_j>0, and -\infty<\lambda_j<\infty are, respectively, the location, scale, and skewness parameters of the j-th component; see Azzalini (1985).
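To make the role of the three skew-normal parameters concrete, here is a minimal sketch of the Azzalini skew-normal density in base R, in location-scale form. The function name `dskewnormal` is hypothetical and is not part of ForestFit; the package's internal parameterization is assumed, not confirmed, to match this form.

```r
# Hypothetical helper (not part of ForestFit): Azzalini skew-normal density
# with location alpha, scale beta > 0, and skewness lambda.
dskewnormal <- function(x, alpha, beta, lambda) {
  stopifnot(beta > 0)
  z <- (x - alpha) / beta
  # 2/beta * phi(z) * Phi(lambda * z)
  (2 / beta) * dnorm(z) * pnorm(lambda * z)
}

# With lambda = 0 the density reduces to the normal density
dskewnormal(0, alpha = 0, beta = 1, lambda = 0)
dnorm(0)
```

Positive values of `lambda` skew the density to the right and negative values skew it to the left.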
Value
The output has three parts. The first part contains the vector of estimated weight, shape, and scale parameters. The second part gives a sequence of goodness-of-fit measures consisting of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC), Bayesian Information Criterion (BIC), Hannan-Quinn Information Criterion (HQIC), Anderson-Darling (AD), Cramér-von Mises (CVM), Kolmogorov-Smirnov (KS), and log-likelihood (log-likelihood) statistics. The last part of the output contains the clustering vector.
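The following sketch shows how some of these quantities could be computed by hand for a Weibull mixture with known parameters. It uses the common textbook definitions of AIC, BIC, and HQIC, and assigns each observation to the component with the largest posterior probability. The helper `mixture_summary`, the parameter count, and the clustering rule are assumptions for illustration; the formulas used inside ForestFit may differ. CAIC is left out because its definition varies between sources.

```r
# Hypothetical helper (not part of ForestFit): log-likelihood, information
# criteria, and a clustering vector for a Weibull mixture with given parameters.
mixture_summary <- function(x, omega, alpha, beta) {
  n <- length(x)
  K <- length(omega)
  # n x K matrix of weighted component densities omega_j * f_j(x_i)
  dens <- sapply(seq_len(K), function(j)
    omega[j] * dweibull(x, shape = alpha[j], scale = beta[j]))
  dens <- matrix(dens, nrow = n)
  loglik <- sum(log(rowSums(dens)))
  # Assumed free-parameter count: K shapes, K scales, and K - 1 weights
  p <- 3 * K - 1
  list(
    loglik  = loglik,
    AIC     = -2 * loglik + 2 * p,
    BIC     = -2 * loglik + p * log(n),
    HQIC    = -2 * loglik + 2 * p * log(log(n)),
    # Component with the largest posterior probability for each observation
    cluster = max.col(dens / rowSums(dens), ties.method = "first")
  )
}

# Simulated data with made-up parameters
set.seed(1)
x <- c(rweibull(100, shape = 2, scale = 5), rweibull(100, shape = 6, scale = 20))
res <- mixture_summary(x, omega = c(0.5, 0.5), alpha = c(2, 6), beta = c(5, 20))
table(res$cluster)
```

Smaller values of the information criteria indicate a better trade-off between fit and number of parameters, which is useful when comparing fits with different K or different families.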
Author(s)
Mahdi Teimouri
References
A. Azzalini, 1985. A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, 12, 171-178.
A. P. Dempster, N. M. Laird, and D. B. Rubin, 1977. Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, 39, 1-38.
M. Teimouri, S. Rezakhah, and A. Mohammadpour, 2018. EM algorithm for symmetric stable mixture model, Communications in Statistics-Simulation and Computation, 47(2), 582-604.
Examples
# Model the northern hardwood uneven-age forest data (HW$DIA), diameters in inches,
# with a 3-component Weibull mixture distribution.
library(ForestFit)
data(HW)
data <- HW$DIA
K <- 3
fitmixture(data, "weibull", K, initial = FALSE)
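A fit can also be sanity-checked on simulated data whose true parameters are known. The sketch below draws a sample from a two-component Weibull mixture in base R and, only if ForestFit is installed, passes it to `fitmixture` with the same argument pattern as the example above. The sample size, seed, and parameter values are made up for illustration.

```r
# Simulate n observations from a two-component Weibull mixture (made-up parameters)
set.seed(123)
n <- 300
z <- sample(1:2, n, replace = TRUE, prob = c(0.6, 0.4))  # latent component labels
shape <- c(2, 6)
scale <- c(5, 20)
sim <- rweibull(n, shape = shape[z], scale = scale[z])

# Fit only if ForestFit is installed; the call mirrors the example above
if (requireNamespace("ForestFit", quietly = TRUE)) {
  fit <- ForestFit::fitmixture(sim, "weibull", 2, initial = FALSE)
  str(fit)
}
```

The estimated weights and parameters can then be compared with the values used in the simulation, bearing in mind that the order of the components in the output is arbitrary.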