Set of CONTROL {pmclust}    R Documentation

A Set of Controls in Model-Based Clustering.

Description

This set of controls is used to guide all algorithms implemented in this package.

Format

A list variable containing several parameters for computing.

Details

.PMC.CT stores all default controls for pmclust and pkmeans, including:

algorithm       algorithms implemented
algorithm.gbd   algorithms implemented for gbd/spmd
method.own.X    how X is distributed
CONTROL         a CONTROL list, as described next

The elements of CONTROL or .pmclustEnv$CONTROL are:

max.iter       maximum number of iterations (1000)
abs.err        absolute error for convergence (1e-4)
rel.err        relative error for convergence (1e-6)
debug          debugging flag (0)
RndEM.iter     number of RndEM iterations (10)
exp.min        minimum exponent (log(.Machine$double.xmin))
exp.max        maximum exponent (log(.Machine$double.xmax))
U.min          minimum of the diagonal of chol
U.max          maximum of the diagonal of chol
stop.at.fail   stop iterating when a failure such as NaN occurs

These elements govern the computation, including the number of iterations, convergence criteria, ill-conditioning, and numerical issues. Some of them are machine dependent.
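As a rough illustration of how the iteration and convergence controls might interact, the sketch below builds a stand-in list from the defaults listed above and applies a simple stopping rule. The names ctl and should.stop are hypothetical, and the way the absolute and relative criteria are combined here is an assumption, not necessarily what the package does internally.

```r
# Hypothetical stand-in for a CONTROL list, using the documented defaults.
# U.min, U.max, and stop.at.fail are left out because their defaults
# are not stated on this page.
ctl <- list(
  max.iter = 1000,
  abs.err = 1e-4,
  rel.err = 1e-6,
  debug = 0,
  RndEM.iter = 10,
  exp.min = log(.Machine$double.xmin),
  exp.max = log(.Machine$double.xmax)
)

# Hypothetical helper: TRUE when the iterations should stop.
# Assumes logL.old is nonzero so that the relative change is defined.
should.stop <- function(logL.old, logL.new, iter, ctl) {
  abs.diff <- abs(logL.new - logL.old)
  rel.diff <- abs.diff / abs(logL.old)
  iter >= ctl$max.iter || abs.diff < ctl$abs.err || rel.diff < ctl$rel.err
}

# A large change in log likelihood early on: keep iterating.
should.stop(-1234.5, -1200.0, iter = 5, ctl = ctl)

# A change below abs.err: stop.
should.stop(-1200.0, -1200.0 + 1e-5, iter = 6, ctl = ctl)
```

In actual use the list is produced by the package (see set.global), so a hand-built list like ctl is only useful for reasoning about the criteria.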

Currently, algorithm can be em, aecm, apecm, apecma, or kmeans for GBD. method.own.X can be gbdr, common, or single.
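A user-side check of these choices could look like the following sketch, which relies only on base R's match.arg. The vectors algorithms and methods.own.X and the helper pick are hypothetical names introduced for illustration; they simply copy the values listed above.

```r
# Allowed values, copied from the text above (hypothetical variable names).
algorithms <- c("em", "aecm", "apecm", "apecma", "kmeans")
methods.own.X <- c("gbdr", "common", "single")

# Hypothetical helper: return the matched choice or raise an error.
pick <- function(x, choices) {
  match.arg(x, choices)
}

pick("apecm", algorithms)
pick("common", methods.own.X)

# An unknown name raises an error, caught here for demonstration.
res <- tryCatch(pick("lloyd", algorithms), error = function(e) "not supported")
res
```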

Numerical Issues

For example, exp.min and exp.max control the range of the density function values before the logarithm is taken. If the density values are not in this range, they are rescaled. The scaling factor is also recorded for post adjustment of the observed data log likelihood. This provides more accurate posterior probabilities and observed data log likelihood.
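The idea of rescaling and post adjustment can be sketched for a single observation as follows. This is a minimal illustration under stated assumptions, not the package's implementation: rescale.logdens is a hypothetical helper, its input is assumed to hold the log of the weighted component densities, and shifting by the largest exponent is one common choice of scaling factor.

```r
exp.min <- log(.Machine$double.xmin)
exp.max <- log(.Machine$double.xmax)

# Hypothetical helper. log.dens holds, for one observation, the K values
# log(mixing proportion) + log(component density).
rescale.logdens <- function(log.dens, exp.min, exp.max) {
  scale <- 0
  top <- max(log.dens)
  if (top < exp.min || top > exp.max) {
    # Shift so that the largest term becomes exp(0) = 1.
    scale <- -top
  }
  dens <- exp(log.dens + scale)
  list(
    posterior = dens / sum(dens),     # unaffected by the common shift
    logL = log(sum(dens)) - scale,    # post adjustment undoes the shift
    scale = scale
  )
}

# All three exponents lie below exp.min, so exp() alone would underflow to 0.
log.dens <- c(-800, -805, -820)
naive <- log(sum(exp(log.dens)))      # -Inf because of underflow
out <- rescale.logdens(log.dens, exp.min, exp.max)
out$logL                              # finite
out$posterior
```

Because every term is shifted by the same amount, the posterior probabilities are unchanged by the rescaling, and only the log likelihood needs the recorded factor.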

Also, U.min and U.max control the output of chol when decomposing SIGMA in every E-step. If the diagonal terms are out of this range, PARAM$U.check is set to FALSE for that component. Only the components whose U.check is TRUE estimate and update their dispersions in the M-steps for the remaining iterations.
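A check of this kind might be sketched as below. The bounds are illustrative values chosen for the example rather than the package defaults, check.U is a hypothetical helper, and treating a failed decomposition as FALSE is an assumption.

```r
# Illustrative bounds only; these are NOT the package defaults.
U.min <- 1e-6
U.max <- 1e+6

# Hypothetical helper: TRUE if every diagonal term of chol(SIGMA)
# lies within [U.min, U.max].
check.U <- function(SIGMA, U.min, U.max) {
  U <- tryCatch(chol(SIGMA), error = function(e) NULL)
  if (is.null(U)) {
    return(FALSE)   # assumption: a failed decomposition counts as a failure
  }
  d <- diag(U)
  all(d >= U.min & d <= U.max)
}

# A well-conditioned dispersion and a nearly degenerate one.
SIGMA.ok <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
SIGMA.degenerate <- diag(c(1, 1e-20))

U.check <- c(
  check.U(SIGMA.ok, U.min, U.max),
  check.U(SIGMA.degenerate, U.min, U.max)
)
U.check
```

The second matrix still decomposes, but the square root of its tiny variance falls below the illustrative U.min, so that component would be excluded from further dispersion updates.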

These problems may cause wrong posteriors and log likelihood because of degenerate and inflated components. Usually, this is a sign that the number of components K is overestimated, or that the initialization does not provide good parameter estimates. See e.step for more information about the computation.

Author(s)

Wei-Chen Chen wccsnow@gmail.com and George Ostrouchov.

References

Programming with Big Data in R Website: https://pbdr.org/

See Also

set.global.gbd and set.global.

Examples

## Not run: 
# Use set.global() to generate one of these.
# X.spmd should be pre-specified before calling set.global().

## End(Not run)

[Package pmclust version 0.2-1 Index]