kml3d {kml3d} | R Documentation |
~ Algorithm kml3d: K-means for Joint Longitudinal data ~
Description
kml3d is a new implementation of k-means for joint longitudinal data (or joint trajectories). This algorithm can handle missing values and provides an easy way to rerun the algorithm several times, varying the starting conditions and/or the number of clusters sought.
Here is the description of the algorithm. For an overview of the package, see kml3d-package.
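To make the idea of k-means on joint trajectories concrete, here is a minimal sketch in base R. It is not the kml3d implementation: it assumes complete data (base kmeans does not accept missing values), uses simulated trajectories, and all variable names are illustrative. Each individual's trajectories over several variables are flattened into one row, so the Euclidean distance covers all variables and time points at once.

```r
# Illustrative sketch only: base R, not the kml3d implementation.
set.seed(1)
n_ind <- 20; n_times <- 5; n_vars <- 2

# Joint trajectories stored as individuals x times x variables.
traj <- array(rnorm(n_ind * n_times * n_vars),
              dim = c(n_ind, n_times, n_vars))
# Shift the second half of the individuals so that two groups exist.
traj[11:20, , ] <- traj[11:20, , ] + 5

# Flatten each individual's joint trajectory into a single row.
flat <- matrix(traj, nrow = n_ind)

# nstart reruns k-means from several random starts and keeps the best fit,
# which is similar in spirit to redrawing the starting conditions.
fit <- kmeans(flat, centers = 2, nstart = 20)
table(fit$cluster)
```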
Usage
kml3d(object, nbClusters = 2:6, nbRedrawing = 20, toPlot = "none",
parAlgo = parKml3d())
Arguments
object |
[ClusterLongData3d]: contains the trajectories to cluster
and some |
nbClusters |
[vector(numeric)]: Vector containing the number of clusters
with which |
nbRedrawing |
[numeric]: Sets the number of times that k-means must be re-run (with different starting conditions) for each number of clusters. |
toPlot |
|
parAlgo |
|
Details
kml3d works on an object of class ClusterLongData3d. For each number i included in nbClusters, kml3d computes a Partition, then stores it in the field cX of the ClusterLongData3d object, where 'X' is its number of clusters. The algorithm starts over as many times as specified by nbRedrawing. By default, it is executed 20 times each for 2, 3, 4, 5 and 6 clusters, namely 100 times in total.
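The total number of runs follows directly from the two default arguments shown in Usage. A quick check in base R (the variable total_runs is only illustrative):

```r
# Default values taken from the Usage section.
nbClusters <- 2:6
nbRedrawing <- 20

# One run per redrawing, for each requested number of clusters.
total_runs <- length(nbClusters) * nbRedrawing
total_runs
# 100
```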
When a Partition has been found, it is added to the corresponding slot c1, c2, c3, ... or c26. The slot cX stores all the Partition objects with X clusters. Inside a sublist, the Partition objects are either sorted from the highest quality criterion to the lowest (the best are stored first, using ordered,ListPartition) or left unsorted.
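The "best first" ordering can be pictured with a plain list in base R. This is only a sketch: the list below is a hypothetical stand-in for a sublist of partitions, and the fields id and criterion are invented for illustration. It does not reproduce the Partition class or the ordered method.

```r
# Hypothetical stand-in for the partitions found with 3 clusters.
partitions <- list(
  list(id = "a", criterion = 12.4),
  list(id = "b", criterion = 30.1),
  list(id = "c", criterion = 21.7)
)

# Extract the quality criterion of each partition.
crit <- vapply(partitions, function(p) p$criterion, numeric(1))

# Reorder so that the highest criterion comes first.
sorted <- partitions[order(crit, decreasing = TRUE)]
vapply(sorted, function(p) p$id, character(1))
# "b" "c" "a"
```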
Note that Partition objects are saved throughout the algorithm. If the user interrupts the execution of kml3d, the result is not lost. If the user runs kml3d on an object, then running kml3d again on the same object will add new Partition objects to the ones already found.
The possible starting conditions are defined in initializePartition.
Value
A ClusterLongData3d object, to which some Partition objects have been added.
Optimisation
Behind kml3d, there are two different procedures:
Fast: when the parameter distance is set to "euclidean3d" and toPlot is set to 'none' or 'criterion', kml3d calls a compiled (optimized) C procedure.
Slow: when the user defines their own distance, or wants to see the construction of the clusters by setting toPlot to 'traj' or 'both', kml3d uses a non-compiled R program.
The C procedure is 25 times faster than the R one.
So we advise using the R procedure 1/ to try a new method (such as a new distance) or 2/ to "see" the very first cluster constructions, in order to check that everything goes right. Then it is better to switch to the C procedure (as we do in the Examples section).
If you need a different distance for a specific use, feel free to contact the author.
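The rule for choosing between the two procedures can be summarised as a small decision function. This is a sketch of the rule as described in this section, not the package's internal code: choose_procedure is a hypothetical helper and its arguments are assumed to mirror the distance and toPlot settings.

```r
# Hypothetical helper, not part of kml3d: restates the rule described above.
choose_procedure <- function(distance = "euclidean3d", toPlot = "none") {
  # The fast path needs the built-in distance and no trajectory plotting.
  fast <- identical(distance, "euclidean3d") &&
    toPlot %in% c("none", "criterion")
  if (fast) "C" else "R"
}

choose_procedure()                    # "C"
choose_procedure(toPlot = "criterion")  # "C"
choose_procedure(toPlot = "both")     # "R"
# A user-defined distance (here an arbitrary example) forces the R path.
choose_procedure(distance = function(x, y) max(abs(x - y)))  # "R"
```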
See Also
Overview: kml3d-package
Classes: ClusterLongData3d, Partition
Methods: clusterLongData3d, choice
Examples
### Move to tempdir
wd <- getwd()
setwd(tempdir()); getwd()
### Generation of some data
cld1 <- generateArtificialLongData3d(15)
### We suspect 2, 3, 4 or 5 clusters, and we want 3 redrawings.
### We want to "see" what happens (so toPlot="both")
kml3d(cld1,2:5,3,toPlot="both")
### 3 seems to be the best.
### We don't want to see the plots again; we want the result as fast as possible.
### Just to check the overall process, we plot the criterion evolution
kml3d(cld1,3,10,toPlot="criterion")
### Go back to current dir
setwd(wd)