tclustIC {fsdaR} | R Documentation |
Performs cluster analysis by calling tclustfsda for different numbers of groups k and restriction factors c
Description
Computes the values of BIC (MIXMIX), ICL (MIXCLA) or CLA (CLACLA) for different values of k (number of groups) and different values of c (restriction factor), for a prespecified level of trimming. The last two letters in the function name stand for 'Information Criterion'. In order to minimize randomness, given k, the same subsets are used for each value of c.
Usage
tclustIC(
x,
kk = 1:5,
cc = c(1, 2, 4, 8, 16, 32, 64, 128),
alpha = 0,
whichIC = c("ALL", "MIXMIX", "MIXCLA", "CLACLA"),
nsamp,
refsteps = 15,
reftol = 1e-14,
equalweights = FALSE,
msg = TRUE,
nocheck = FALSE,
plot = FALSE,
startv1 = 1,
restrtype = c("eigen", "deter"),
UnitsSameGroup,
numpool,
cleanpool,
trace = FALSE,
...
)
Arguments
x |
An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations. |
kk |
An integer vector specifying the number of mixture components (clusters) for which the BIC is to be calculated. By default kk = 1:5. |
cc |
A vector specifying the values of the restriction factor which have to be considered. By default cc = c(1, 2, 4, 8, 16, 32, 64, 128). |
alpha |
Global trimming level. A scalar between 0 and 0.5 or an integer specifying the number of
observations which have to be trimmed. If More in detail, if |
whichIC |
A character value which specifies which information criteria must be computed
for each
|
nsamp |
If a scalar, it contains the number of subsamples which will be extracted.
If If
REMARK: If |
refsteps |
Number of refining iterations in each subsample. Default is refsteps = 15. |
reftol |
Tolerance of the refining steps. The default value is 1e-14 |
equalweights |
A logical specifying whether cluster weights in the concentration
and assignment steps shall be considered. If |
msg |
Controls whether or not to display messages on the screen. If |
nocheck |
Check input arguments. If |
plot |
If |
startv1 |
How to initialize centroids and covariance matrices. Scalar.
If Remark 1: in order to start with a routine which is in the required parameter space, eigenvalue restrictions are immediately applied. Remark 2 - option |
restrtype |
Type of restriction to be applied on the cluster scatter matrices.
Valid values are |
UnitsSameGroup |
List of the units which must (whenever possible) have
a particular label. For example |
numpool |
The number of parallel sessions to open. If numpool is not defined, then it is set equal to the number of physical cores in the computer. |
cleanpool |
Logical, indicating whether the open pool must be closed or not. It is useful to leave it open if there are subsequent parallel sessions to execute, so as to save the time required to open a new pool. |
trace |
Whether to print intermediate results. Default is trace = FALSE. |
... |
potential further arguments passed to lower level functions. |
Value
An S3 object of class tclustic.object
Author(s)
FSDA team, valentin.todorov@chello.at
References
Cerioli, A., Garcia-Escudero, L.A., Mayo-Iscar, A. and Riani, M. (2017). Finding the Number of Groups in Model-Based Clustering via Constrained Likelihoods, Journal of Computational and Graphical Statistics, pp. 404-416, https://doi.org/10.1080/10618600.2017.1390469.
See Also
tclustfsda, tclustICplot, tclustICsol, carbikeplot
Examples
## Not run:
data(geyser2)
(out <- tclustIC(geyser2, whichIC="MIXMIX", plot=FALSE, alpha=0.1))
summary(out)
## End(Not run)
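The size of the search can be worked out before calling the function. The sketch below uses base R only and does not call tclustIC; it assumes that every value in kk is paired with every value in cc, and it simply enumerates the resulting (k, c) grid for the default arguments shown in Usage. The object name grid is illustrative.

```r
# Default grids, copied from the Usage section
kk <- 1:5
cc <- c(1, 2, 4, 8, 16, 32, 64, 128)

# One row per (k, c) pair; illustration only, tclustIC is not called here
grid <- expand.grid(k = kk, c = cc)

nrow(grid)   # number of (k, c) combinations implied by the defaults
head(grid)
```

Shortening kk or cc shrinks this grid proportionally, which is the simplest way to reduce the running time of an exploratory call.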
## Not run:
data(flea)
Y <- as.matrix(flea[, 1:(ncol(flea)-1)]) # select only the numeric variables
rownames(Y) <- 1:nrow(Y)
head(Y)
(out <- tclustIC(Y, whichIC="CLACLA", plot=FALSE, alpha=0.1, nsamp=100, numpool=1))
summary(out)
## End(Not run)
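To give a feel for what a value of the restriction factor c means, the following base-R sketch assumes that with restrtype = "eigen" the factor bounds the ratio between the largest and the smallest eigenvalue of the cluster scatter matrices, as in the constrained-likelihood approach of the reference above. The helper eigen_ratio and the matrices S1 and S2 are hypothetical and are not part of fsdaR.

```r
# Hypothetical helper (not part of fsdaR): ratio between the largest and the
# smallest eigenvalue taken over all the scatter matrices in a list
eigen_ratio <- function(sigmas) {
  ev <- unlist(lapply(sigmas, function(s) {
    eigen(s, symmetric = TRUE, only.values = TRUE)$values
  }))
  max(ev) / min(ev)
}

# Two made-up diagonal scatter matrices
S1 <- diag(c(1, 4))    # eigenvalues 1 and 4
S2 <- diag(c(2, 16))   # eigenvalues 2 and 16

r <- eigen_ratio(list(S1, S2))

# Values in the default cc that would admit this configuration
# (a small tolerance guards against floating-point noise)
cc <- c(1, 2, 4, 8, 16, 32, 64, 128)
admissible <- cc[cc + 1e-8 >= r]
```

Under this assumption, c = 1 forces all eigenvalues to be equal (spherical clusters with the same scatter), while larger values of c allow increasingly different cluster shapes and sizes.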