Compute parameter estimates with the Expectation-Maximization (EM) algorithm.

Description

Compute an approximation of the maximum likelihood estimates of the parameters using the Expectation-Maximization (EM) algorithm. A maximum a posteriori (MAP) classification is then derived from the estimated parameters.
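
To make the two steps concrete, here is a minimal base R sketch of EM for a mixture of independent categorical variables, followed by the MAP classification. It is an illustration only, not the code behind em.cluster.R: the function toy_em, its arguments, and the toy data are hypothetical, it handles only ploidy = 1, and it uses every variable for clustering.

```r
# Toy EM for a mixture of independent categorical variables (ploidy = 1).
# x: character matrix, rows = individuals, columns = variables.
# toy_em is a hypothetical helper, not part of any package.
toy_em <- function(x, K, n_iter = 50, seed = 1) {
  set.seed(seed)
  n <- nrow(x)
  lev <- lapply(seq_len(ncol(x)), function(j) sort(unique(x[, j])))
  # Random soft assignment to start; each row sums to 1.
  tik <- matrix(runif(n * K), n, K)
  tik <- tik / rowSums(tik)
  for (it in seq_len(n_iter)) {
    # M-step: mixing proportions and per-cluster level probabilities.
    pi_k <- colMeans(tik)
    prob <- lapply(seq_len(ncol(x)), function(j) {
      ind <- outer(x[, j], lev[[j]], "==") * 1  # n x levels indicator
      p <- t(ind) %*% tik                       # levels x K weighted counts
      sweep(p, 2, colSums(p), "/")              # each column sums to 1
    })
    # E-step: posterior memberships, computed on the log scale.
    logp <- matrix(log(pi_k), n, K, byrow = TRUE)
    for (j in seq_len(ncol(x))) {
      idx <- match(x[, j], lev[[j]])
      logp <- logp + log(prob[[j]][idx, , drop = FALSE])
    }
    m <- apply(logp, 1, max)
    log_lik <- sum(m + log(rowSums(exp(logp - m))))
    tik <- exp(logp - m)
    tik <- tik / rowSums(tik)
  }
  list(pi_K = pi_k, prob = prob, logLik = log_lik, Tik = tik,
       mapClassif = apply(tik, 1, which.max))  # MAP classification
}

# Hypothetical toy data: 40 individuals, 2 variables.
x <- cbind(v1 = rep(c("a", "b"), each = 20),
           v2 = rep(c("c", "d"), each = 20))
fit <- toy_em(x, K = 2)
```

The list returned by the sketch reuses some component names from the Value section (pi_K, prob, logLik, Tik, mapClassif) only to ease comparison; the actual function may compute them differently.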

Usage

em.cluster.R(xdata, K, S, ploidy = 1, emOptions = list(epsi = NULL,
typeSmallEM = NULL, typeEM = NULL, nberSmallEM = NULL, nberIterations = NULL,
nberMaxIterations = NULL, putThreshold = NULL), cte = 1)


Arguments

• xdata : A matrix of strings with the number of columns equal to ploidy * (number of variables).

• K : The number of clusters (or populations).

• S : The subset of clustering variables, given as a vector of logicals indicating the selected variables. S gathers the variables that are not identically distributed in at least two clusters.

• ploidy : The number of unordered observations represented by a string in xdata. For example, for genotypic data from diploid individuals, ploidy = 2.

• emOptions : A list of EM options (see EmOptions and setEmOptions).

• cte : A double used as the value of λ in the penalty function pen(K, S) = λ * dim(K, S), where dim(K, S) is the number of free parameters in the model defined by (K, S).
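
The shapes of xdata and S, and the arithmetic of the penalty, can be sketched in base R as follows. Everything here is illustrative: the toy genotypes are made up, the two columns of a variable are assumed to be adjacent, and the free-parameter count uses a common latent class convention (K - 1 mixing proportions, one set of level probabilities per cluster for variables in S, a single shared set for the others), which may not match the package's own dim exactly.

```r
# Hypothetical toy data: 4 diploid individuals, 3 loci, so 2 * 3 = 6 columns.
ploidy <- 2
xdata <- matrix(c("101", "103", "A", "A", "20", "22",
                  "101", "101", "A", "B", "22", "22",
                  "105", "103", "B", "B", "20", "20",
                  "103", "103", "A", "B", "20", "22"),
                nrow = 4, byrow = TRUE)
n_var <- ncol(xdata) / ploidy
S <- c(TRUE, TRUE, FALSE)  # cluster on the first two loci only
K <- 2

# Observed levels per variable, pooling the ploidy columns of each variable.
# Assumption: columns belonging to the same variable are adjacent.
n_levels <- sapply(seq_len(n_var), function(j) {
  cols <- ((j - 1) * ploidy + 1):(j * ploidy)
  length(unique(as.vector(xdata[, cols])))
})

# Assumed free-parameter count (latent class convention, see lead-in).
free_dim <- (K - 1) + K * sum(n_levels[S] - 1) + sum(n_levels[!S] - 1)

# Penalty pen(K, S) = lambda * dim(K, S), with lambda supplied as cte.
cte <- 1
pen <- cte * free_dim
```

Under these assumptions, a larger cte penalizes each additional free parameter more heavily, which favors fewer clusters and smaller subsets S.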

Value

A list with the following components:

• N : The size (number of lines) of the dataset.

• K : The number of clusters (populations).

• S : A vector of logicals indicating the selected variables for clustering.

• dim : The number of free parameters.

• pi_K : The vector of mixing proportions.

• prob : A list of matrices, each matrix being the probabilities of a variable in different clusters.

• logLik : The log-likelihood.

• entropy : The entropy.

• criteria : Criteria values c(BIC, AIC, ICL, CteDim).

• Tik : A stochastic matrix giving the a posteriori membership probabilities.

• mapClassif : Maximum a posteriori classification.

• NbersLevels : The numbers of observed levels of the considered categorical variables.

• levels : The observed levels.
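
As a rough illustration of how Tik, mapClassif, and entropy relate, the base R sketch below starts from a made-up membership matrix. The entropy formula is the usual classification entropy, -sum(Tik * log(Tik)); the sign and scaling used by the package are an assumption here, and the criteria vector is deliberately not reproduced.

```r
# Hypothetical posterior membership matrix: 3 individuals, K = 2 clusters.
Tik <- matrix(c(0.9, 0.1,
                0.2, 0.8,
                0.5, 0.5), ncol = 2, byrow = TRUE)

# Each row of a stochastic matrix sums to 1.
stopifnot(all(abs(rowSums(Tik) - 1) < 1e-12))

# MAP classification: index of the largest posterior in each row.
# which.max returns the first maximum, so ties go to the lower index.
map_classif <- apply(Tik, 1, which.max)

# Classification entropy, with the convention 0 * log(0) = 0.
entropy <- -sum(ifelse(Tik > 0, Tik * log(Tik), 0))
```

An entropy near zero means nearly every individual is assigned to one cluster with high posterior probability; the third row above, split evenly between the two clusters, contributes the most.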

Examples

data(genotype1)