get_nbCluster_range {Infusion} | R Documentation |
Control of number of components in Gaussian mixture modelling
Description
These functions implement the default values for the number of components tried in Gaussian mixture modelling (matching the nbCluster argument of Rmixmod::mixmodCluster()). get_nbCluster_range allows the user to reproduce the internal rules used by Infusion to determine this argument. seq_nbCluster is a wrapper for the function defined by the seq_nbCluster global option of the package. Its default result is a sequence of integers determined by the number of rows of the data (see Infusion.options). get_nbCluster_range() further checks the feasibility of the values generated by seq_nbCluster(), using additional criteria involving the number of columns of the data to determine the maximum feasible number of clusters. This maximum is controlled by the function defined by the maxnbCluster global option of the package.
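The feasibility check described above can be pictured with a small base-R sketch. The cap used here is an assumption made for illustration only and is not the rule implemented by Infusion's maxnbCluster option: it simply requires at least one row per free parameter of a full-covariance Gaussian component (nc means plus nc*(nc+1)/2 covariance terms). The name toy_feasible is hypothetical.

```r
# Hypothetical stand-in for the feasibility filter; NOT Infusion's actual rule.
toy_feasible <- function(nbCluster, nr, nc) {
  # Free parameters of one Gaussian component with a full covariance matrix
  per_component <- nc + nc * (nc + 1) / 2
  # Assumed cap: at least one data row per free parameter
  cap <- floor(nr / per_component)
  # Keep only the candidate values that do not exceed the cap
  nbCluster[nbCluster <= cap]
}

# Candidates 5, 20 and 40 for a table with 30000 rows and 40 columns
toy_feasible(c(5L, 20L, 40L), nr = 30000L, nc = 40L)
```

With this toy cap, candidate values above floor(nr / per_component) are dropped and the others are returned unchanged.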
refine_nbCluster controls the default number of clusters of refine: it gets the range from seq_nbCluster and keeps only the maximum value of this range if this maximum is higher than the onlymax argument.
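As a minimal sketch of the onlymax rule, the following base-R code mimics the described behaviour without calling Infusion. Two points are assumptions: toy_seq is a hypothetical stand-in for the function set by the seq_nbCluster option (its log2 rule is arbitrary), and the sketch assumes that the full range is returned when its maximum does not exceed onlymax.

```r
# Hypothetical stand-in for the seq_nbCluster rule; the log2 rule is arbitrary.
toy_seq <- function(nr) seq_len(max(1L, floor(log2(nr))))

# Sketch of the onlymax logic described above (not Infusion's source code).
toy_refine <- function(nr, onlymax = 7, seq_fn = toy_seq) {
  rng <- seq_fn(nr)
  # Keep only the maximum when it is higher than onlymax;
  # otherwise (assumption) return the whole range
  if (max(rng) > onlymax) max(rng) else rng
}

toy_refine(100L)    # small table: the whole toy range is kept
toy_refine(30000L)  # large table: only the maximum of the toy range is kept
```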
Adventurous users can change the rules used by Infusion by changing the global options seq_nbCluster and maxnbCluster (while conforming to the interfaces of these functions). Less ambitiously, they can, for example, use the maximum value of the result of get_nbCluster_range() as a single reasonable value for the nbCluster argument of infer_SLik_joint.
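The less ambitious option might look like the sketch below. It is guarded so that it does nothing when Infusion is not installed. The object names my_reftable and my_obs are hypothetical placeholders for a reference table and the observed summary statistics, so the call to infer_SLik_joint is left commented out.

```r
# Runs only if the Infusion package is available
if (requireNamespace("Infusion", quietly = TRUE)) {
  # Feasible range for a table with 30000 rows and 40 columns
  rng <- Infusion::get_nbCluster_range(nr = 30000L, nc = 40L)
  # Single value to pass as nbCluster
  nbC <- max(rng)
  # Hypothetical call; my_reftable and my_obs are placeholders:
  # slik_j <- Infusion::infer_SLik_joint(my_reftable, stat.obs = my_obs,
  #                                      nbCluster = nbC)
}
```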
Usage
seq_nbCluster(nr)
refine_nbCluster(nr, onlymax=7)
get_nbCluster_range(projdata, nr = nrow(projdata), nc = ncol(projdata),
nbCluster = seq_nbCluster(nr))
Arguments
projdata |
data frame: the data to be clustered, which typically include parameters and projected summary statistics; |
nr |
integer: number of rows of the data to be clustered; |
onlymax |
integer: see Description; |
nc |
integer: number of columns of the data to be clustered, typically twice the number of estimated parameters; |
nbCluster |
integer or vector of integers: candidate values, whose feasibility is checked by the function. |
Value
An integer vector
Examples
# Determination of number of clusters when attempting to estimate
# 20 parameters from a reference table with 30000 rows:
seq_nbCluster(nr=30000L)
get_nbCluster_range(nr=30000L, nc=40L) # nc = *twice* the number of parameters