Step3Clusters {traj} | R Documentation |
Classify the Longitudinal Data Based on the Selected Measures.
Description
Classifies the trajectories by applying a clustering algorithm (k-medoids by default, or k-means) to the measures selected by Step2Selection.
Usage
Step3Clusters(
  trajSelection,
  algorithm = "k-medoids",
  metric = "euclidean",
  nstart = 200,
  iter.max = 100,
  nclusters = NULL,
  criterion = "Calinski-Harabasz",
  K.max = min(15, nrow(trajSelection$selection) - 1),
  boot = FALSE,
  R = 100,
  B = 500
)
## S3 method for class 'trajClusters'
print(x, ...)
## S3 method for class 'trajClusters'
summary(object, ...)
Arguments
trajSelection |
object of class |
algorithm |
either |
metric |
to be passed to the |
nstart |
to be passed to the |
iter.max |
to be passed to the |
nclusters |
either |
criterion |
criterion to determine the optimal number of clusters if |
K.max |
maximum number of clusters to be considered if |
boot |
logical. If |
R |
the number of bootstrap replicates if |
B |
to be passed to the |
x |
object of class |
... |
further arguments passed to or from other methods. |
object |
object of class |
Details
If "GAP" is the chosen criterion for determining the optimal number of clusters, the method described by Tibshirani et al. is implemented by the clusGap function.
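As a rough illustration of the GAP criterion, the sketch below calls clusGap from the cluster package directly on simulated data. The matrix X, the wrapper pam_k, and the values of K.max and B are hypothetical stand-ins chosen for this example, and the "Tibs2001SEmax" selection rule is an assumption; Step3Clusters may configure clusGap differently.

```r
library(cluster)  # provides pam(), clusGap() and maxSE()

set.seed(1)
# Hypothetical stand-in for the selected measures:
# 60 trajectories described by 2 measures, drawn from 3 separated groups
X <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# clusGap expects a function of (x, k) returning a list with a $cluster component
pam_k <- function(x, k) list(cluster = pam(x, k, cluster.only = TRUE))

# Gap statistic for k = 1, ..., K.max, using B reference data sets
gap <- clusGap(X, FUNcluster = pam_k, K.max = 6, B = 50, verbose = FALSE)

# Choose k with the rule proposed by Tibshirani et al. (2001)
# (assumed here for illustration only)
k_gap <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"], method = "Tibs2001SEmax")

print(gap$Tab)
print(k_gap)
```

Increasing B reduces the simulation error in the gap estimates at the cost of longer run time.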
If "Calinski-Harabasz" is the chosen criterion instead, the Calinski-Harabasz index is computed for each possible number of clusters between 2 and K.max, and the optimal number of clusters is the maximizer of the index. Moreover, if boot is set to TRUE, then, following the guidelines suggested by Mésidor et al., a sampling distribution of the optimal number of clusters is obtained by bootstrap, and the optimal number of clusters is chosen to be the (first) mode of this sampling distribution.
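The Calinski-Harabasz selection and its bootstrap variant can be sketched in base R as follows. This is a minimal illustration on simulated data using k-means, not the package's internal code: the matrix X and the helpers ch_index and best_k are hypothetical names introduced here, and the number of replicates is kept small for speed.

```r
set.seed(42)
# Hypothetical stand-in for the selected measures:
# 90 trajectories described by 2 measures, drawn from 3 separated groups
X <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.4), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.4), ncol = 2),
  matrix(rnorm(60, mean = 8, sd = 0.4), ncol = 2)
)

# Calinski-Harabasz index of a k-means fit with k clusters:
# CH(k) = [between-cluster SS / (k - 1)] / [within-cluster SS / (n - k)]
ch_index <- function(x, k, nstart = 20, iter.max = 50) {
  fit <- kmeans(x, centers = k, nstart = nstart, iter.max = iter.max)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (nrow(x) - k))
}

# Optimal number of clusters: the maximizer of CH over 2, ..., K.max
best_k <- function(x, K.max) {
  ks <- 2:K.max
  ch <- vapply(ks, function(k) ch_index(x, k), numeric(1))
  ks[which.max(ch)]
}

K.max <- 6
k_hat <- best_k(X, K.max)

# Bootstrap: resample the rows with replacement and recompute the optimal k
R <- 50
k_boot <- replicate(R, {
  idx <- sample(nrow(X), replace = TRUE)
  best_k(X[idx, , drop = FALSE], K.max)
})

# (First) mode of the bootstrap distribution; which.max returns the first maximum
tab <- table(factor(k_boot, levels = 2:K.max))
k_mode <- as.integer(names(tab)[which.max(tab)])

print(k_hat)
print(tab)
print(k_mode)
```

The table of bootstrap results also indicates how stable the choice is: a distribution spread over several values of k suggests that the number of clusters is uncertain.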
Value
An object of class trajClusters; a list containing the result of the clustering, as well as a curated form of the arguments.
References
Mésidor, M., Sirois, C., Simard, M. and Talbot, D. (2023). A bootstrap approach for evaluating uncertainty in the number of groups identified by latent class growth models. American Journal of Epidemiology, 192(11), 1896–1903. https://doi.org/10.1093/aje/kwad148
Tibshirani, R., Walther, G. and Hastie, T. (2001). Estimating the number of data clusters via the Gap statistic. Journal of the Royal Statistical Society B, 63, 411–423.
Tibshirani, R., Walther, G. and Hastie, T. (2000). Estimating the number of clusters in a dataset via the Gap statistic. Technical Report. Stanford.
See Also
Examples
## Not run:
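data("trajdata")
# Remove the Group column
trajdata.noGrp <- trajdata[, -which(colnames(trajdata) == "Group")]
# Compute measures 1 to 18 for each trajectory
m <- Step1Measures(trajdata.noGrp, ID = TRUE, measures = 1:18)
# Select measures with the default settings and inspect the loadings
s <- Step2Selection(m)
s$RC$loadings
# Select measures 10, 12, 8 and 4 manually
s2 <- Step2Selection(m, select = c(10, 12, 8, 4))
# Partitions obtained with 3, 4 and 5 clusters
c3.part <- Step3Clusters(s2, nclusters = 3)$partition
c4.part <- Step3Clusters(s2, nclusters = 4)$partition
c5.part <- Step3Clusters(s2, nclusters = 5)$partition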
## End(Not run)