choosenbclust {clusterMI}

R Documentation

Tune the number of clusters according to the partition instability

Description

choosenbclust reports the cluster instability for each candidate number of clusters, so that the number of clusters giving the most stable partition can be selected.

Usage

choosenbclust(output, grid = 2:5, graph = TRUE, verbose = TRUE, nnodes = NULL)

Arguments

`output`	the object returned by the clusterMI function
`grid`	a vector of candidate values for nb.clust. Defaults to 2:5
`graph`	a boolean indicating whether a graphic is plotted
`verbose`	if TRUE, choosenbclust prints messages to the console
`nnodes`	number of CPU cores for parallel computing. Defaults to NULL, in which case the value used in the call to the clusterMI function is reused

Details

The choosenbclust function goes through a grid of values for the number of clusters. For each value, it imputes the data and computes the instability.

Value

A list of two objects:

`nb.clust`	the number of clusters in `grid` minimizing the instability
`crit`	a vector indicating the instability for each value in the grid
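
The relation between the two returned objects can be illustrated with a minimal base R sketch. The instability values below are invented for illustration and are not output from clusterMI; the sketch only assumes that `crit` is ordered like `grid`, as described above.

```r
# Hypothetical instability values, made up for illustration only
grid <- 2:5
crit <- c(0.12, 0.05, 0.09, 0.15)

# nb.clust is the element of grid with the smallest instability
nb.clust <- grid[which.min(crit)]
nb.clust
```

With these made-up values the smallest instability is the second one, so the selected number of clusters is 3.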

References

Audigier, V. and Niang, N. (2022). Clustering with missing data: which equivalent for Rubin's rules? Advances in Data Analysis and Classification. <doi:10.1007/s11634-022-00519-1>

Examples

library(clusterMI)
library(parallel)

data(wine)
set.seed(123456)
ref <- wine$cult
nb.clust <- 3
wine.na <- wine
wine.na$cult <- NULL
wine.na <- prodna(wine.na)

# imputation
res.imp <- imputedata(data.na = wine.na, nb.clust = nb.clust, m = 5)

# pooling
nnodes <- 2 # number of CPU cores for parallel computing
res.pool <- clusterMI(res.imp, nnodes = nnodes, instability = FALSE)

# choice of nb.clust

choosenbclust(res.pool)
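
The call above prints the result and uses the default grid. As a hedged variation, the sketch below stores the returned list and restricts the search with the `grid`, `graph` and `verbose` arguments listed under Usage. It repeats the setup so that it can be run on its own, and it assumes the clusterMI package is installed. The name `res.choice` is only illustrative.

```r
# Sketch only: assumes the clusterMI package is installed
library(clusterMI)

data(wine)
set.seed(123456)
wine.na <- wine
wine.na$cult <- NULL
wine.na <- prodna(wine.na)  # add missing values

# imputation and pooling, as in the example above
res.imp <- imputedata(data.na = wine.na, nb.clust = 3, m = 5)
res.pool <- clusterMI(res.imp, nnodes = 2, instability = FALSE)

# test 2, 3 and 4 clusters only, without plot or console messages
res.choice <- choosenbclust(res.pool, grid = 2:4, graph = FALSE, verbose = FALSE)

res.choice$nb.clust  # value of grid with the lowest instability
res.choice$crit      # instability for each value in grid
```

Storing the result makes it possible to reuse `res.choice$nb.clust` in later steps instead of reading it off the console.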

[Package clusterMI version 1.2.1 Index]