mixalg {CAMAN} | R Documentation |

## Fitting Finite Mixture Models

### Description

This hybrid mixture algorithm combines the VEM algorithm for flexible support size and the EM algorithm for a fixed number of components. The solution of the VEM algorithm provides starting values for the EM algorithm. By the NPMLE theorem the EM algorithm thus starts very close to the global maximum and proper convergence of the EM algorithm to a global maximum is ensured.

The algorithm proceeds as follows:

Step 1: Define an approximating grid `lambda[1], ..., lambda[L]`.

Step 2: Use the VEM algorithm to maximize L(P) in the simplex `\Omega` and identify grid points with positive support. Here positive support is defined as `p[j] >= epsilon` (often `epsilon = 10^-2`). This gives an initial estimate of `k`.

Step 3: Use these `k` points and the corresponding mixing weights `p[j]` as starting values for the EM algorithm.

Step 4: Collapse identical components if `| lambda[j] - lambda[i] | < delta` (often `delta = 0.05`) for `i != j`.

Step 5: Obtain the final number of components `k`.

This sequential algorithm leads to an initial estimate of the NPMLE and a proper solution for the subsequent EM algorithm. Crucial points are the definitions of `\delta` and `\epsilon`. Depending on these settings, different solutions could result from this algorithm.
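The five steps can be sketched in base R for a Poisson mixture. This is a minimal illustration under stated assumptions, not the CAMAN implementation: the data are simulated, all variable names are hypothetical, and Step 2 uses plain EM updates of the weights on the fixed grid as a simple stand-in for the VEM algorithm.

```r
# Illustrative sketch only; NOT the code used inside mixalg().
set.seed(1)
x <- c(rpois(150, 2), rpois(150, 10))   # simulated two-component Poisson data

# Step 1: approximating grid lambda[1], ..., lambda[L] over the data range
L <- 25
grid <- seq(max(min(x), 0.1), max(x), length.out = L)

# Step 2 (stand-in for VEM): maximize L(P) over the weights on the fixed grid
# using EM weight updates, then keep grid points with p[j] >= epsilon
dens <- outer(x, grid, dpois)           # n x L matrix of f(x_i | lambda_j)
p <- rep(1 / L, L)
for (it in 1:2000) {
  post <- sweep(dens, 2, p, "*")        # unnormalized posterior probabilities
  post <- post / rowSums(post)
  p <- colMeans(post)
}
epsilon <- 1e-2
keep <- p >= epsilon
lambda <- grid[keep]                    # initial support points
w <- p[keep] / sum(p[keep])             # renormalized initial weights

# Step 3: EM with the number of components fixed at sum(keep)
loglik <- function(lambda, w) sum(log(outer(x, lambda, dpois) %*% w))
acc <- 1e-7
ll_old <- loglik(lambda, w)
for (it in 1:5000) {
  post <- sweep(outer(x, lambda, dpois), 2, w, "*")
  post <- post / rowSums(post)
  w <- colMeans(post)                          # update mixing weights
  lambda <- colSums(post * x) / colSums(post)  # update component means
  ll_new <- loglik(lambda, w)
  if (ll_new - ll_old < acc) break             # stop when deltaLL < acc
  ll_old <- ll_new
}

# Step 4: collapse neighbouring components closer than delta
delta <- 0.05
ord <- order(lambda)
lambda <- lambda[ord]
w <- w[ord]
group <- cumsum(c(TRUE, diff(lambda) >= delta))  # new group when the gap >= delta
w_final <- as.numeric(tapply(w, group, sum))
lambda_final <- as.numeric(tapply(lambda * w, group, sum)) / w_final

# Step 5: final number of components
k <- length(lambda_final)
```

Changing `epsilon` or `delta` in this sketch can change `k`, which mirrors the sensitivity noted above.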

### Usage

```
mixalg(obs, weights=NULL, family="gaussian", data=NULL, pop.at.risk=NULL,
var.lnOR=NULL, limit=0.01, acc=10^(-7), numiter=5000, startk=50)
```

### Arguments

| Argument | Description |
|---|---|
| `obs` | observed / dependent variable. Vector or colname of `data`. |
| `weights` | weights of the data. Vector or colname of `data`. |
| `family` | the underlying type of density function as a character string (`"gaussian"`, `"poisson"` or `"binomial"`). |
| `data` | an optional data frame. |
| `pop.at.risk` | population at risk: these data can be used to determine a mixture model for Poisson data. Vector or colname of `data`. |
| `var.lnOR` | variances of the data: these variances may be given when working with meta-analyses. Vector or colname of `data`. |
| `limit` | parameter controlling the limit for merging (unifying) several components. Default is 0.01. |
| `acc` | convergence criterion. The VEM and EM loops stop when `deltaLL < acc`. Default is 10^(-7). |
| `numiter` | maximal number of iterations in the VEM and EM loops. Default is 5000. |
| `startk` | starting/maximal number of components. This number is used to compute the grid in the VEM. Default is 50. |
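As a hedged illustration of the two ways the variables can be supplied, the sketch below fits the same model once with column names plus a data frame and once with plain vectors. It uses the `thai_cohort` data from the Examples section; the object names `mix_cols` and `mix_vecs` are hypothetical, and the code only runs if the CAMAN package is installed.

```r
# Sketch only: runs only if the CAMAN package is available.
if (requireNamespace("CAMAN", quietly = TRUE)) {
  library(CAMAN)
  data(thai_cohort)

  # Variant 1: column names, looked up in the data frame given by `data`
  mix_cols <- mixalg(obs = "counts", weights = "frequency",
                     family = "poisson", data = thai_cohort)

  # Variant 2: the same variables passed directly as vectors
  mix_vecs <- mixalg(obs = thai_cohort$counts,
                     weights = thai_cohort$frequency,
                     family = "poisson")
}
```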

### Details

The documentation of `leukDat` contains a disease mapping example using `mixalg`, and the documentation of `golubMerge` contains a microarray analysis example.

### Value

The function returns a `CAMAN.object` describing a finite mixture model.
The main information about the mixture model is printed by typing the name of the object. Additional information is given by `summary(object)` (see `summary.CAMAN.object`).
Single attributes (slots) can be accessed using the `@` operator, e.g. `mix@LL`.

| Slot | Description |
|---|---|
| `dat` | (input) data |
| `family` | underlying type of density function |
| `LL` | log-likelihood of the final (best) iteration |
| `BIC` | Bayesian information criterion (BIC) of the final (best) iteration |
| `num.k` | number of components obtained |
| `p` | probability (mixing weight) of each component |
| `t` | parameter of the distribution (normal distribution: mean; Poisson distribution: lambda; binomial distribution: prob) |
| `component.var` | variance of each component (only if `family` is `"gaussian"`) |
| `prob` | probabilities of belonging to each component |
| `classification` | classification labels for each observation |
| `steps` | number of steps performed (EM, VEM) |
| `VEM_result` | result of the VEM algorithm, which was run before the EM algorithm |
| `cl` | the matched call |
| `is_metaAnalysis` | parameter specifying whether a meta-analysis was performed |
| `finalacc` | deltaLL of the final iteration (for VEM and EM) |
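A short sketch of inspecting the returned object, using only the slots listed above and the `thai_cohort` call from the Examples section. It assumes the CAMAN package is installed and that `p` and `t` each hold one entry per component; the object name `mix` is illustrative.

```r
# Sketch only: runs only if the CAMAN package is available.
if (requireNamespace("CAMAN", quietly = TRUE)) {
  library(CAMAN)
  data(thai_cohort)
  mix <- mixalg(obs = "counts", weights = "frequency",
                family = "poisson", data = thai_cohort)

  print(mix)      # main information about the mixture model
  summary(mix)    # additional information

  mix@num.k       # number of components obtained
  mix@LL          # log-likelihood of the final iteration
  mix@BIC         # BIC of the final iteration

  # Mixing weights next to the component parameters (Poisson: lambda)
  cbind(weight = mix@p, lambda = mix@t)
}
```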

### Author(s)

Peter Schlattmann and Johannes Hoehne

### References

Böhning, D., Schlattmann, P. and Lindsay, B. G. (1992). C.A.MAN - Computer Assisted Analysis of Mixtures: Statistical Algorithms. *Biometrics*, 48, 283-303.

Schlattmann, P. (2005). On bootstrapping the unknown number of components in finite mixtures of Poisson distributions. *Statistics and Computing*, 15, 179-188.

Schlattmann, P. (2009). *Medical Applications of Finite Mixture Models.* Berlin: Springer.

### See Also

`mixalg.EM`, `mixalg.VEM`, `anova.CAMAN.object`, `mixcov`, `mixalg.boot`

### Examples

```
library(CAMAN)

### POISSON data with weights: thai_cohort
data(thai_cohort)
mix <- mixalg(obs="counts", weights="frequency", family="poisson",
              data=thai_cohort, numiter=18000, acc=0.00001, startk=25)

### Meta-analysis: aspirin (uses the default family="gaussian")
data(aspirin)
mix <- mixalg(obs="logrr", var.lnOR="var", data=aspirin)

## See the documentation of golubMerge for a
## microarray analysis example using mixalg
## See the documentation of leukDat for a disease
## mapping example using mixalg
```

[Package *CAMAN* version 0.78]