latrend-approaches {latrend} | R Documentation |
High-level approaches to longitudinal clustering
Description
This page provides high-level guidance on which methods are applicable to your dataset. It is intended as a quick-start guide.
Recommended overview and comparison papers:
- Den Teuling et al. (2021): a tutorial and overview of methods for longitudinal clustering.
- Den Teuling et al. (2021) compared KmL, MixTVEM, GBTM, GMM, and GCKM.
- Twisk and Hoekstra (2012) compared KmL, GCKM, LLCA, GBTM, and GMM.
- Verboon and Pat-El (2022) compared the kml, traj, and lcmm packages in R.
- Martin and von Oertzen (2015) compared KmL, LCA, and GMM.
Approaches
Disclaimer: The table below has been adapted from a pre-print of Den Teuling et al. (2021).
Approach | Strengths | Limitations | Methods |
Cross-sectional clustering | Suitable for large datasets; Many available algorithms; Non-parametric cluster trajectory representation | Requires time-aligned complete data; Sensitive to measurement noise | lcMethodKML, lcMethodMclustLLPA, lcMethodMixtoolsNPRM |
Distance-based clustering | Suitable for medium-sized datasets; Many distance metrics; Distance matrix only needs to be computed once | Scales poorly with number of trajectories; No robust cluster trajectory representation; Some distance metrics require aligned observations | lcMethodDtwclust |
Feature-based clustering | Suitable for large datasets; Configurable; Features only need to be computed once; Compact trajectory representation | Generally requires intensive longitudinal data; Sensitive to outliers | lcMethodFeature, lcMethodAkmedoids, lcMethodLMKM, lcMethodGCKM |
Model-based clustering | Parametric cluster trajectory; Can incorporate (domain) assumptions; Low sample size requirements | Computationally intensive; Scales poorly with number of clusters; Convergence challenges | lcMethodLcmmGBTM, lcMethodLcmmGMM, lcMethodCrimCV, lcMethodFlexmix, lcMethodFlexmixGBTM, lcMethodFunFEM, lcMethodMixAK_GLMM, lcMethodMixtoolsGMM, lcMethodMixTVEM |
You are strongly encouraged to evaluate and compare several candidate methods to identify the one most suitable for your data.
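As a rough illustration of comparing candidates, the sketch below fits one cross-sectional method (lcMethodKML) and one feature-based method (lcMethodLMKM) to the latrendData example dataset that ships with the package, then compares their fit and their cluster assignments. The choice of three clusters, the two methods, and the MAE and WMAE metrics are assumptions made only for this example, not recommendations. lcMethodKML additionally requires the kml package to be installed.

```r
library(latrend)

# Example dataset shipped with latrend; trajectories are identified by "Id",
# measured at "Time", with response "Y".
data(latrendData)

# Assumption for illustration: three clusters for both candidates.
# Cross-sectional approach: longitudinal k-means (needs the kml package).
kmlMethod <- lcMethodKML(response = "Y", id = "Id", time = "Time", nClusters = 3)
# Feature-based approach: k-means on per-trajectory linear model coefficients.
lmkmMethod <- lcMethodLMKM(formula = Y ~ Time, id = "Id", time = "Time", nClusters = 3)

set.seed(1)  # both methods involve random initialization
kmlModel <- latrend(kmlMethod, data = latrendData)
lmkmModel <- latrend(lmkmMethod, data = latrendData)

# Compare fit using error metrics that are defined regardless of the method.
fitTable <- rbind(
  kml = metric(kmlModel, c("MAE", "WMAE")),
  lmkm = metric(lmkmModel, c("MAE", "WMAE"))
)
print(fitTable)

# Cross-tabulate the cluster assignments to see how far the two solutions agree.
# Cluster labels are arbitrary, so look for a one-to-one pattern, not a diagonal.
agreement <- table(
  kml = trajectoryAssignments(kmlModel),
  lmkm = trajectoryAssignments(lmkmModel)
)
print(agreement)
```

Lower error values indicate a closer fit of the cluster trajectories to the data, but a single metric is rarely decisive. See latrend-metrics for other criteria that can be used in the same way.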
References
Den Teuling N, Pauws S, van den Heuvel E (2021).
“Clustering of longitudinal data: A tutorial on a variety of approaches.”
doi:10.48550/ARXIV.2111.05469, https://arxiv.org/abs/2111.05469.
Den Teuling NGP, Pauws SC, van den Heuvel ER (2021).
“A comparison of methods for clustering longitudinal data with slowly changing trends.”
Communications in Statistics - Simulation and Computation.
doi:10.1080/03610918.2020.1861464.
Martin DP, von Oertzen T (2015).
“Growth mixture models outperform simpler clustering algorithms when detecting longitudinal heterogeneity, even with small sample sizes.”
Structural Equation Modeling: A Multidisciplinary Journal, 22(2), 264–275.
ISSN 1070-5511, doi:10.1080/10705511.2014.936340.
Twisk J, Hoekstra T (2012).
“Classifying developmental trajectories over time should be done with great caution: A comparison between methods.”
Journal of Clinical Epidemiology, 65(10), 1078–1087.
ISSN 0895-4356, doi:10.1016/j.jclinepi.2012.04.010.
Verboon P, Pat-El R (2022).
“Clustering Longitudinal Data Using R: A Monte Carlo Study.”
Methodology, 18(2), 144–163.
doi:10.5964/meth.7143.
See Also
latrend-methods, latrend-estimation, latrend-metrics