carbikeplot {fsdaR} | R Documentation |
Produces the car-bike plot to find the best relevant clustering solutions obtained by tclustICsol
Description
Takes as input the output of function tclustICsol
(that is a structure containing the best relevant solutions) and produces the
car-bike plot. This plot provides a concise summary of the best relevant solutions.
The plot shows the value of c on the horizontal axis and the value of k on the vertical
axis. For each solution we draw a rectangle spanning the interval of values of c for
which the solution is best and stable, and a horizontal line that departs from the
rectangle over the values of c for which the solution is only stable. Finally, at the
best value of c associated with the solution, we show a circle with two numbers: the
first is the rank of the solution among those which are not spurious, and the second
is its rank when the spurious solutions are included.
The plot is called 'car-bike' because the first best solutions (in general 2 or 3) are
usually best and stable for a large number of values of c and therefore have large
rectangles. In addition, these solutions are likely to be stable for additional values
of c and therefore to have horizontal lines departing from the rectangles (hence the
name 'cars'). Finally, local minor solutions (which are associated with particular
values of c and k) generally do not present rectangles or lines and are shown with
circles only (hence the name 'bikes').
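The geometry described above can be sketched with base R graphics. This is only an illustration, not the code used by carbikeplot: the data frame `sol`, its column names and all its values are hypothetical and are not output of tclustICsol, the helper `pos` is invented for the sketch, and each circle shows a single rank instead of the two numbers drawn in the real plot.

```r
## Illustrative only: hypothetical solutions, not output of tclustICsol()
sol <- data.frame(
  rank    = 1:3,
  k       = c(3, 2, 4),        # number of groups (vertical axis)
  c_best  = c(4, 32, 128),     # value of c at which the solution is best
  best_lo = c(1, 16, 128),     # interval of c where it is best and stable
  best_hi = c(8, 64, 128),
  stab_hi = c(32, 128, 128)    # right end of the interval where it is stable
)

cc  <- 2^(0:7)                   # assumed grid of c values: 1, 2, ..., 128
pos <- function(x) match(x, cc)  # position of a c value on an equally spaced axis

plot(NA, xlim = c(0.5, length(cc) + 0.5), ylim = range(sol$k) + c(-0.5, 0.5),
     xaxt = "n", yaxt = "n", xlab = "c", ylab = "k",
     main = "Car-bike plot (sketch)")
axis(1, at = seq_along(cc), labels = cc)
axis(2, at = sort(unique(sol$k)))

h <- 0.15                        # half-height of the rectangles
for (i in seq_len(nrow(sol))) {
  s <- sol[i, ]
  ## Horizontal line: values of c where the solution is only stable
  if (s$stab_hi > s$best_hi)
    segments(pos(s$best_hi), s$k, pos(s$stab_hi), s$k, lwd = 2)
  ## Rectangle: values of c where the solution is best and stable
  if (s$best_hi > s$best_lo)
    rect(pos(s$best_lo), s$k - h, pos(s$best_hi), s$k + h, col = "grey85")
  ## Circle with the rank, placed at the best value of c
  points(pos(s$c_best), s$k, pch = 21, cex = 3, bg = "white")
  text(pos(s$c_best), s$k, labels = s$rank, cex = 0.8)
}
```

In this sketch the first two solutions get a rectangle and a departing line ('cars'), while the third, tied to a single value of c, is drawn as a circle only (a 'bike').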
Usage
carbikeplot(out, SpuriousSolutions = FALSE, trace = FALSE, ...)
Arguments
out |
An S3 object of class |
SpuriousSolutions |
Whether to include spurious solutions. By default, spurious solutions are not included in the plot. |
trace |
Whether to print intermediate results. Default is trace=FALSE. |
... |
potential further arguments passed to lower level functions. |
Author(s)
FSDA team, valentin.todorov@chello.at
References
Cerioli, A., Garcia-Escudero, L.A., Mayo-Iscar, A. and Riani, M. (2017). Finding the Number of Groups in Model-Based Clustering via Constrained Likelihoods, Journal of Computational and Graphical Statistics, pp. 404-416, https://doi.org/10.1080/10618600.2017.1390469.
Examples
## Not run:
## Car-bike plot for the geyser data ========================
data(geyser2)
out <- tclustIC(geyser2, whichIC="MIXMIX", plot=FALSE, alpha=0.1)
## Find the best solutions using as Information criterion MIXMIX
print("Best solutions using MIXMIX")
outMIXMIX <- tclustICsol(out, whichIC="MIXMIX", plot=FALSE, NumberOfBestSolutions=6)
print(outMIXMIX$MIXMIXbs)
carbikeplot(outMIXMIX)
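## Sketch added for illustration (not part of the original example): the same
## plot with the spurious solutions included, assuming only the
## SpuriousSolutions argument listed under Usage (its default is FALSE)
carbikeplot(outMIXMIX, SpuriousSolutions=TRUE)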
## Car-bike plot for the flea data ==========================
data(flea)
Y <- as.matrix(flea[, 1:(ncol(flea)-1)]) # select only the numeric variables
rownames(Y) <- 1:nrow(Y)
head(Y)
out <- tclustIC(Y, whichIC="CLACLA", plot=FALSE, alpha=0.1, nsamp=100)
## Find the best solutions using as Information criterion CLACLA
print("Best solutions using CLACLA")
outCLACLA <- tclustICsol(out, whichIC="CLACLA", plot=FALSE, NumberOfBestSolutions=66)
## Produce the car-bike plot
carbikeplot(outCLACLA)
## End(Not run)