Initialization and EM {EMCluster} | R Documentation |

These functions perform initialization (via em.EM or RndEM) followed by EM iterations for model-based clustering of a finite mixture of multivariate Gaussian distributions with unstructured dispersion, in both unsupervised and semi-supervised settings.

```
init.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
        stable.solution = TRUE, min.n = NULL, min.n.iter = 10,
        method = c("em.EM", "Rnd.EM"))
em.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
      stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
rand.EM(x, nclass = 1, lab = NULL, EMC = .EMC.Rnd,
        stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
exhaust.EM(x, nclass = 1, lab = NULL,
           EMC = .EMControl(short.iter = 1, short.eps = Inf),
           method = c("em.EM", "Rnd.EM"),
           stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
```

`x` | the data matrix, dimension n × p. |

`nclass` | the desired number of clusters, K. |

`lab` | labeled data for semi-supervised clustering, a vector of length n; 0 marks an unlabeled observation. |

`EMC` | the control for the EM iterations; see `.EMControl`. |

`stable.solution` | whether to return a stable solution. |

`min.n` | restriction for a stable solution: the minimum number of observations in every final cluster. |

`min.n.iter` | restriction for a stable solution: the minimum number of iterations to try for a stable solution. |

`method` | an initialization method, either `"em.EM"` or `"Rnd.EM"`. |

The `init.EM` function calls either `em.EM` (if `method = "em.EM"`) or `rand.EM` (if `method = "Rnd.EM"`).

The `em.EM` method has two steps: a short-EM stage tries several random initializations (the number is controlled by `.EMC$short.iter`) with a loose convergence tolerance controlled by `.EMC$short.eps`, and a long-EM stage then starts from the best short-EM result (in terms of log likelihood) and runs to convergence with a tight tolerance controlled by `.EMC$em.eps`.
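The two stages can be tuned by building a custom control object with `.EMControl`. A minimal sketch, assuming the demo data `da1` shipped with the package (the specific tolerance values are illustrative, not recommendations):

```r
library(EMCluster, quietly = TRUE)

set.seed(1234)
x <- da1$da                       # demo data shipped with EMCluster

# Fewer short-EM restarts, tighter long-EM tolerance
# (illustrative values only).
my.EMC <- .EMControl(short.iter = 50, short.eps = 1e-2, em.eps = 1e-8)

ret <- init.EM(x, nclass = 10, EMC = my.EMC, method = "em.EM")
ret$llhdval                       # log likelihood of the converged solution
```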

The `rand.EM` method also has two steps: it first draws several random initializations (controlled by `.EMC$short.iter`), then starts from the best of the random results (in terms of log likelihood) and runs to convergence.

The `lab` argument is only for semi-supervised clustering. It contains pre-assigned indices between 1 and `K` for the labeled observations; an index of 0 marks an unlabeled observation, which the EM algorithm must cluster. Final indices are assigned from the EM results. See `demo(allinit_ss, 'EMCluster')` for details.
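A minimal semi-supervised sketch, assuming the demo data `da1` (which carries true classes in `da1$class`); here a random 10% of the observations are revealed as labels and the rest are left at 0:

```r
library(EMCluster, quietly = TRUE)

set.seed(1234)
x <- da1$da

# Start all-unlabeled (0), then reveal the true class of a
# random 10% of the observations as labels between 1 and K.
lab <- rep(0, nrow(x))
idx <- sample(seq_len(nrow(x)), size = floor(0.1 * nrow(x)))
lab[idx] <- da1$class[idx]

ret.ss <- init.EM(x, nclass = 10, lab = lab, method = "em.EM")
table(ret.ss$class)               # cluster sizes of the final assignment
```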

The `exhaust.EM` function also calls `init.EM`, but with a different `EMC`, and performs the EM algorithm `exhaust.iter` times from different initial values. The best result is returned.
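A sketch of calling `exhaust.EM` directly, again assuming the demo data `da1`; setting `exhaust.iter` through `.EMControl` is an assumption consistent with the default `EMC` shown in the usage above:

```r
library(EMCluster, quietly = TRUE)

set.seed(1234)
x <- da1$da

# Run 5 independent EM fits from different initial values and
# keep the best (highest log likelihood).
EMC.ex <- .EMControl(short.iter = 1, short.eps = Inf, exhaust.iter = 5)
ret.ex <- exhaust.EM(x, nclass = 10, EMC = EMC.ex)
ret.ex$llhdval
```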

These functions return an object `emobj` with class `emret`, which can be used in post-processing or in other functions such as `e.step`, `m.step`, `assign.class`, `em.ic`, and `dmixmvn`.
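For instance, the returned `emret` object can be handed to the post-processing helpers above. A sketch, assuming the demo data `da1`:

```r
library(EMCluster, quietly = TRUE)

set.seed(1234)
x <- da1$da
ret <- init.EM(x, nclass = 10, method = "em.EM")

ret <- assign.class(x, ret)       # attach cluster memberships to ret$class
em.ic(x, emobj = ret)             # information criteria for model selection
head(dmixmvn(x, emobj = ret))     # mixture density at the first observations
```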

Wei-Chen Chen wccsnow@gmail.com and Ranjan Maitra.

https://www.stat.iastate.edu/people/ranjan-maitra

```
## Not run:
library(EMCluster, quietly = TRUE)
set.seed(1234)
x <- da1$da
ret.em <- init.EM(x, nclass = 10, method = "em.EM")
ret.Rnd <- init.EM(x, nclass = 10, method = "Rnd.EM", EMC = .EMC.Rnd)
emobj <- simple.init(x, nclass = 10)
ret.init <- emcluster(x, emobj, assign.class = TRUE)
par(mfrow = c(2, 2))
plotem(ret.em, x)
plotem(ret.Rnd, x)
plotem(ret.init, x)
## End(Not run)
```

[Package *EMCluster* version 0.2-13]