R: Clustering via Stochastic Approximation and Gaussian Mixture...

SAGMMFit {SAGMM}

R Documentation

Clustering via Stochastic Approximation and Gaussian Mixture Models (GMM)

Description

Fit a Gaussian mixture model (GMM) via stochastic approximation. See References.

Usage

SAGMMFit(X, Y = NULL, Burnin = 5, ngroups = 5, kstart = 10,
  plot = FALSE)

Arguments

`X`	Numeric matrix of the data.
`Y`	Group membership (if known), given as integers in 1:ngroups. If provided, ngroups can be omitted (see ngroups).
`Burnin`	Ratio of observations to use as a burn-in before the algorithm begins.
`ngroups`	Number of mixture components. If Y is provided and ngroups is not, ngroups is overridden by Y.
`kstart`	Number of k-means starts used for initialisation.
`plot`	If TRUE generates a plot of the clustering.

Value

A list containing

`Cluster`	The clustering of each observation.
`plot`	A plot of the clustering (if requested).
`l2`	Estimate of Lambda^2
`ARI1`	Adjusted Rand Index 1 - using k-means
`ARI2`	Adjusted Rand Index 2 - using GMM Clusters
`ARI3`	Adjusted Rand Index 3 - using initialisation k-means
`KM`	Initial K-means clustering of the data.
`pi`	The cluster proportions (vector of length ngroups)
`tau`	tau matrix of conditional probabilities.
`fit`	Full output details from inner C++ loop.
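
As a rough illustration of how a matrix of conditional probabilities such as `tau` relates to a hard clustering and to cluster proportions, the base R sketch below uses a made-up matrix. It assumes rows are observations and columns are mixture components, which this page does not state, and the names `hard_cluster` and `prop` are hypothetical. Taking the row-wise maximum and the column means is the usual convention for mixture models; the `Cluster` and `pi` values returned by SAGMMFit come from the stochastic approximation fit and may not match these quantities exactly.

```r
# Toy tau matrix: 4 observations (rows) by 3 components (columns).
# The values are invented for illustration; each row sums to 1.
tau <- matrix(c(0.8, 0.1, 0.1,
                0.2, 0.7, 0.1,
                0.1, 0.2, 0.7,
                0.6, 0.3, 0.1),
              ncol = 3, byrow = TRUE)

# Hard assignment: the component with the largest conditional probability
# in each row (assumed convention, not confirmed for SAGMMFit).
hard_cluster <- max.col(tau, ties.method = "first")

# Proportion estimate: the average conditional probability per component.
prop <- colMeans(tau)

print(hard_cluster)
print(prop)
```

With a real fit, the same two lines could be applied to the `tau` element of the returned list and compared against `Cluster` and `pi`.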

Author(s)

Andrew T. Jones and Hien D. Nguyen

References

Nguyen & Jones (2018). Big Data-Appropriate Clustering via Stochastic Approximation and Gaussian Mixture Models. In Data Analytics (pp. 79-96). CRC Press.

Examples

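# Simulate data with 10 groups in 10 dimensions.
sims <- generateSimData(ngroups = 10, Dimensions = 10, Number = 10^4)
# Fit with the known group memberships supplied as Y.
res1 <- SAGMMFit(sims$X, sims$Y)
# Fit without group memberships, using 5 mixture components.
res2 <- SAGMMFit(sims$X, ngroups = 5)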
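
The ARI values returned when Y is supplied measure agreement between two labelings of the same observations. The base R sketch below computes the standard Adjusted Rand Index from a contingency table. The function name `adjusted_rand_index` is hypothetical and is not part of SAGMM, and this page does not say how the package computes its ARI values internally, so treat it as an independent check under those assumptions rather than the package's own method.

```r
# Adjusted Rand Index between two label vectors (standard pair-counting form).
# Hypothetical helper for illustration; not a SAGMM function.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                          # contingency table of the two labelings
  sum_cells <- sum(choose(tab, 2))            # pairs grouped together in both labelings
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs grouped together in a
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs grouped together in b
  total     <- choose(length(a), 2)           # all pairs of observations
  expected  <- sum_rows * sum_cols / total    # index expected under random labeling
  max_index <- (sum_rows + sum_cols) / 2
  # Note: returns NaN when max_index equals expected,
  # for example when both labelings contain a single group.
  (sum_cells - expected) / (max_index - expected)
}

# The index ignores how the groups are numbered.
truth      <- c(1, 1, 2, 2, 3, 3)
relabelled <- c(2, 2, 3, 3, 1, 1)
ari_same <- adjusted_rand_index(truth, relabelled)

# Labelings that split every true pair apart score below zero.
ari_cross <- adjusted_rand_index(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

Applied to the first example, a call such as `adjusted_rand_index(sims$Y, res1$Cluster)` would give a value to set beside the reported ARI elements.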

[Package SAGMM version 0.2.4 Index]