mlcc.bic {varclust} | R Documentation |
Multiple Latent Components Clustering - Subspace clustering with automatic estimation of number of clusters and their dimension
Description
This function implements the Multiple Latent Components Clustering (MLCC)
algorithm, which clusters quantitative variables into groups; the number of
groups is chosen using mBIC. For each candidate number of clusters in
numb.clusters, the mlcc.reps function is called. It invokes a K-means based
algorithm (mlcc.kmeans) that finds a local minimum of mBIC and is run a given
number of times (numb.runs) with different initializations. The best
partition is chosen with mBIC (see the mlcc.reps function).
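The selection scheme described above (several randomly initialized runs per candidate number of clusters, keeping the partition with the lowest criterion value) can be sketched in base R. This is only an illustration under loud assumptions: stats::kmeans on the transposed data stands in for mlcc.kmeans, and toy_score is a hypothetical BIC-style penalty, not mBIC. It is not the package's implementation.

```r
# Illustrative stand-in only: stats::kmeans and a made-up penalized score
# replace mlcc.kmeans and mBIC.
set.seed(1)
X <- scale(matrix(rnorm(50 * 12), nrow = 50))  # 50 observations, 12 variables
vars <- t(X)                                   # one row per variable

# Hypothetical criterion (lower is better): a fit term plus a penalty
# that grows with the number of clusters.
toy_score <- function(fit, n_obs) {
  n_vars <- length(fit$cluster)
  n_vars * n_obs * log(fit$tot.withinss / (n_vars * n_obs)) +
    length(fit$size) * n_obs * log(n_vars)
}

numb.clusters <- 1:4
numb.runs <- 5

# For each candidate number of clusters, keep the best of several runs.
best <- lapply(numb.clusters, function(k) {
  runs <- lapply(seq_len(numb.runs), function(i) kmeans(vars, centers = k))
  scores <- vapply(runs, toy_score, numeric(1), n_obs = nrow(X))
  list(k = k, fit = runs[[which.min(scores)]], score = min(scores))
})

# Pick the number of clusters with the lowest score overall.
all_scores <- vapply(best, function(b) b$score, numeric(1))
chosen <- best[[which.min(all_scores)]]
print(chosen$k)
```

The sketch scans every candidate; with greedy = TRUE the documented behaviour is instead to stop at the first local minimum of the criterion.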
Usage
mlcc.bic(X, numb.clusters = 1:10, numb.runs = 30, stop.criterion = 1,
max.iter = 30, max.dim = 4, scale = TRUE, numb.cores = NULL,
greedy = TRUE, estimate.dimensions = TRUE, verbose = FALSE,
flat.prior = FALSE, show.warnings = FALSE)
Arguments
X |
A data frame or a matrix with only continuous variables. |
numb.clusters |
A vector, numbers of clusters to be checked. |
numb.runs |
An integer, number of runs (initializations) of mlcc.kmeans. |
stop.criterion |
An integer, if an iteration of mlcc.kmeans makes fewer changes in the partition than stop.criterion, mlcc.kmeans stops. |
max.iter |
An integer, maximum number of iterations of the loop in mlcc.kmeans. |
max.dim |
An integer, if estimate.dimensions is FALSE then max.dim is the dimension of each subspace. If estimate.dimensions is TRUE then the subspace dimensions are estimated from the range [1, max.dim]. |
scale |
A boolean, if TRUE (value set by default) then variables in dataset are scaled to zero mean and unit variance. |
numb.cores |
An integer, number of cores to be used, by default all cores are used. |
greedy |
A boolean, if TRUE (value set by default) the clusters are estimated in a greedy way: the first local minimum of mBIC is chosen. |
estimate.dimensions |
A boolean, if TRUE (value set by default) subspaces dimensions are estimated. |
verbose |
A boolean, if TRUE a plot of mBIC values for the different numbers of clusters is produced, and the mBIC values computed for every number of clusters and every subspace dimension are printed (value set by default is FALSE). |
flat.prior |
A boolean, if TRUE then a flat prior is used instead of an informative prior that takes into account the number of models for a given number of clusters. |
show.warnings |
A boolean, if set to TRUE all warnings are displayed, default value is FALSE. |
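As an illustration of what scale = TRUE means for the input, the following base R sketch standardizes each column to zero mean and unit variance. That mlcc.bic performs the equivalent of base::scale internally is an assumption; the data here are simulated purely for the example.

```r
# Assumption: scale = TRUE corresponds to column-wise standardization,
# as done by base::scale(). The matrix below is simulated for illustration.
set.seed(42)
X <- matrix(rnorm(40, mean = 5, sd = 3), nrow = 10,
            dimnames = list(NULL, paste0("V", 1:4)))

X.scaled <- scale(X, center = TRUE, scale = TRUE)

# After scaling, every column has mean 0 and standard deviation 1
# (up to floating-point error).
col.means <- colMeans(X.scaled)
col.sds <- apply(X.scaled, 2, sd)
print(round(col.means, 10))
print(round(col.sds, 10))
```

If the data are already standardized, passing scale = FALSE avoids repeating this step.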
Value
An object of class mlcc.fit consisting of
segmentation |
a vector containing the partition of the variables |
BIC |
numeric, value of mBIC |
subspacesDimensions |
a list containing dimensions of the subspaces |
nClusters |
an integer, estimated number of clusters |
factors |
a list of matrices, basis for each subspace |
all.fit |
a list of the segmentation, mBIC and subspace dimensions for every number of clusters considered, using the estimated subspace dimensions |
all.fit.dims |
a list of lists of the segmentation, mBIC and subspace dimensions for every number of clusters and every subspace dimension considered |
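The segmentation element assigns each variable to a cluster, so the variables can be grouped by cluster with base R. The vector below is a made-up stand-in for the segmentation element of a fitted object (an assumption for illustration; no model is fitted here), and the variable names are hypothetical.

```r
# Hypothetical partition of six variables into three clusters,
# mimicking the documented `segmentation` element.
segmentation <- c(1, 2, 1, 3, 2, 1)
var.names <- paste0("V", seq_along(segmentation))

# Group variable names by their cluster label.
groups <- split(var.names, segmentation)
print(groups)

# Number of variables in each cluster.
cluster.sizes <- table(segmentation)
print(cluster.sizes)
```

With a real fit, the same calls would be applied to the segmentation element of the returned object and the column names of X.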
Examples
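library(varclust)
# Simulate data from K = 3 subspaces, each of dimension at most 3.
sim.data <- data.simulation(n = 50, SNR = 1, K = 3, numb.vars = 50, max.dim = 3)
# Check 1 to 5 clusters with 20 runs each on a single core; verbose = TRUE plots and prints mBIC values.
mlcc.res <- mlcc.bic(sim.data$X, numb.clusters = 1:5, numb.runs = 20, numb.cores = 1, verbose = TRUE)
show.clusters(sim.data$X, mlcc.res$segmentation)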