ChooseK {MGMM} | R Documentation
Cluster Number Selection
Description
Function to choose the number of clusters k. Examines cluster numbers between
k0 and k1. For each cluster number, generates boot bootstrap data sets, fits
the Gaussian Mixture Model (FitGMM), and calculates quality metrics
(ClustQual). For each metric, determines the optimal cluster number k_opt and
k_1SE, the smallest cluster number whose quality is within 1 SE of the optimum.
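The selection rule described above can be sketched in base R. This is only an illustration, not the package's implementation: pick_k_1se is a hypothetical helper, the metric values are made up, and it assumes that "within 1 SE" refers to the standard error at the optimal cluster number, which this page does not specify.

```r
# Hypothetical helper, not part of MGMM: applies a one-standard-error rule
# to per-k summaries of a single quality metric.
pick_k_1se <- function(k, metric_mean, metric_se, higher_is_better = TRUE) {
  # Flip the sign when lower is better, so larger scores are always better.
  score <- if (higher_is_better) metric_mean else -metric_mean
  i_opt <- which.max(score)
  # Assumption: the tolerance is the SE at the optimal cluster number.
  threshold <- score[i_opt] - metric_se[i_opt]
  c(k_opt = k[i_opt], k_1SE = min(k[score >= threshold]))
}

# Made-up summaries for illustration only (not output from ChooseK).
k <- 2:6
sil_mean <- c(0.40, 0.59, 0.61, 0.60, 0.58)
sil_se <- rep(0.03, 5)
choice <- pick_k_1se(k, sil_mean, sil_se)
print(choice)
```

With these illustrative numbers the metric peaks at k = 4, but k = 3 already falls within one SE of that peak, so the rule prefers the smaller model. For metrics where lower values are better, the same helper can be called with higher_is_better = FALSE.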
Usage
ChooseK(
  data,
  k0 = 2,
  k1 = NULL,
  boot = 100,
  init_means = NULL,
  fix_means = FALSE,
  init_covs = NULL,
  init_props = NULL,
  maxit = 10,
  eps = 1e-04,
  report = TRUE
)
Arguments
data | Numeric data matrix.
k0 | Minimum number of clusters.
k1 | Maximum number of clusters.
boot | Number of bootstrap replicates.
init_means | Optional list of initial mean vectors.
fix_means | Fix the means to their starting values? If TRUE, initial values must be provided.
init_covs | Optional list of initial covariance matrices.
init_props | Optional vector of initial cluster proportions.
maxit | Maximum number of EM iterations.
eps | Minimum acceptable increment in the EM objective.
report | Report bootstrap progress?
Value
List containing Choices, the recommended number of clusters according to each
quality metric, and Results, the mean and standard error of the quality
metrics at each cluster number evaluated.
See Also
See ClustQual for evaluating cluster quality, and FitGMM for estimating the
GMM with a specified cluster number.
Examples
set.seed(100)
mean_list <- list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2))
data <- rGMM(n = 500, d = 2, k = 4, means = mean_list)
choose_k <- ChooseK(data, k0 = 2, k1 = 6, boot = 10)
choose_k$Choices
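A possible follow-up, sketched under assumptions: after reviewing the recommendations, one might inspect the Results element and refit the model at a chosen cluster number. The internal layout of Results is not described on this page, so str() is used rather than indexing specific columns. The cluster number passed to FitGMM is set by hand to the number of simulated components, and the k argument name is assumed from the FitGMM help page. The block is guarded so that it does nothing when MGMM is not installed.

```r
# Guarded so the sketch is skipped when the MGMM package is unavailable.
mgmm_available <- requireNamespace("MGMM", quietly = TRUE)

if (mgmm_available) {
  set.seed(100)
  mean_list <- list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2))
  data <- MGMM::rGMM(n = 500, d = 2, k = 4, means = mean_list)

  # Few bootstrap replicates to keep the run short; progress output disabled.
  choose_k <- MGMM::ChooseK(data, k0 = 2, k1 = 6, boot = 10, report = FALSE)

  # Inspect the per-k summaries without assuming their column names.
  str(choose_k$Results)

  # Assumption: k = 4 is chosen by hand here, matching the simulation above.
  # In practice it would be read from choose_k$Choices.
  fit <- MGMM::FitGMM(data, k = 4)
}
```

Refitting on the full data set is done separately because ChooseK, as documented here, returns only the recommendations and metric summaries.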