R: High-level approaches to longitudinal clustering

latrend-approaches {latrend}

R Documentation

High-level approaches to longitudinal clustering

Description

This page provides high-level guidelines on which methods are applicable to your dataset. It is intended as a quick-start.

Recommended overview and comparison papers:

Den Teuling et al. (2021, arXiv pre-print) give a tutorial and overview of methods for longitudinal clustering.
Den Teuling et al. (2021, Communications in Statistics) compared KmL, MixTVEM, GBTM, GMM, and GCKM.
Twisk and Hoekstra (2012) compared KmL, GCKM, LLCA, GBTM, and GMM.
Verboon and Pat-El (2022) compared the kml, traj, and lcmm packages in R.
Martin and von Oertzen (2015) compared KmL, LCA, and GMM.

Approaches

Disclaimer: The table below has been adapted from a pre-print of Den Teuling et al. (2021).

Approach	Strengths	Limitations	Methods
Cross-sectional clustering	Suitable for large datasets; Many available algorithms; Non-parametric cluster trajectory representation	Requires time-aligned complete data; Sensitive to measurement noise	lcMethodKML, lcMethodMclustLLPA, lcMethodMixtoolsNPRM
Distance-based clustering	Suitable for medium-sized datasets; Many distance metrics; Distance matrix only needs to be computed once	Scales poorly with number of trajectories; No robust cluster trajectory representation; Some distance metrics require aligned observations	lcMethodDtwclust
Feature-based clustering	Suitable for large datasets; Configurable; Features only need to be computed once; Compact trajectory representation	Generally requires intensive longitudinal data; Sensitive to outliers	lcMethodFeature, lcMethodAkmedoids, lcMethodLMKM, lcMethodGCKM
Model-based clustering	Parametric cluster trajectory; Can incorporate (domain) assumptions; Low sample size requirements	Computationally intensive; Scales poorly with number of clusters; Convergence challenges	lcMethodLcmmGBTM, lcMethodLcmmGMM, lcMethodCrimCV, lcMethodFlexmix, lcMethodFlexmixGBTM, lcMethodFunFEM, lcMethodMixAK_GLMM, lcMethodMixtoolsGMM, lcMethodMixTVEM
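
To make the difference between the first and third rows of the table concrete, here is a minimal base-R sketch. It does not use latrend, and it is not how any of the listed methods are implemented. The simulated data and all variable names (longData, wideMat, features, and so on) are illustrative assumptions. The cross-sectional variant runs k-means directly on the time-aligned measurements, while the feature-based variant first reduces each trajectory to its linear regression coefficients and then clusters those.

```r
set.seed(1)

# Simulated long-format data (illustrative only): 60 trajectories observed at
# the same 10 time points, in two groups with opposite slopes.
nIds <- 60
times <- seq(0, 1, length.out = 10)
trueGroup <- rep(1:2, each = nIds / 2)
slopes <- c(-1, 1)[trueGroup]
longData <- data.frame(
  Id = rep(seq_len(nIds), each = length(times)),
  Time = rep(times, times = nIds)
)
longData$Y <- slopes[longData$Id] * longData$Time +
  rnorm(nrow(longData), sd = 0.1)

# Cross-sectional approach: every time point becomes a variable.
# This only works because the data are complete and time-aligned.
wideMat <- matrix(longData$Y, nrow = nIds, byrow = TRUE)
crossSectional <- kmeans(wideMat, centers = 2, nstart = 10)

# Feature-based approach: summarise each trajectory by its intercept and
# slope, then cluster the (much smaller) feature matrix.
features <- t(sapply(split(longData, longData$Id), function(d) {
  coef(lm(Y ~ Time, data = d))
}))
featureBased <- kmeans(features, centers = 2, nstart = 10)

# Cross-tabulate the two partitions; cluster labels themselves are arbitrary.
print(table(crossSectional$cluster, featureBased$cluster))
```

The feature matrix has one row per trajectory regardless of how many observations each trajectory has, which is why the feature step copes with unaligned observation times, whereas the wide matrix in the cross-sectional step does not.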

We strongly encourage evaluating and comparing several candidate methods to identify the one most suitable for your data.
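
A comparison of candidate methods could look like the sketch below. It assumes that the latrend and kml packages are installed and that the installed latrend version provides the latrendData example dataset (columns Id, Time, Y) and the "WMAE" and "RMSE" metric names. The choice of two methods and three clusters is arbitrary and only for illustration.

```r
# Runs only if the required packages are available (an assumption of this sketch).
if (requireNamespace("latrend", quietly = TRUE) &&
    requireNamespace("kml", quietly = TRUE)) {
  library(latrend)
  data(latrendData)  # example dataset shipped with latrend

  # One candidate per approach: cross-sectional (KmL) and feature-based (LMKM).
  candidates <- list(
    lcMethodKML(response = "Y", id = "Id", time = "Time", nClusters = 3),
    lcMethodLMKM(formula = Y ~ Time, id = "Id", time = "Time", nClusters = 3)
  )

  # Fit every candidate method on the same dataset.
  models <- latrendBatch(candidates, data = latrendData, seed = 1)

  # Compare the fitted models on residual-based metrics (lower is better).
  print(metric(models, c("WMAE", "RMSE")))
}
```

Residual-based metrics are used here because they can be computed for every method type; a fuller comparison would also vary the number of clusters per method.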

References

Den Teuling N, Pauws S, van den Heuvel E (2021). “Clustering of longitudinal data: A tutorial on a variety of approaches.” doi:10.48550/ARXIV.2111.05469, https://arxiv.org/abs/2111.05469.

Den Teuling NGP, Pauws SC, van den Heuvel ER (2021). “A comparison of methods for clustering longitudinal data with slowly changing trends.” Communications in Statistics - Simulation and Computation. doi:10.1080/03610918.2020.1861464.

Martin DP, von Oertzen T (2015). “Growth mixture models outperform simpler clustering algorithms when detecting longitudinal heterogeneity, even with small sample sizes.” Structural Equation Modeling: A Multidisciplinary Journal, 22(2), 264–275. ISSN 1070-5511, doi:10.1080/10705511.2014.936340.

Twisk J, Hoekstra T (2012). “Classifying developmental trajectories over time should be done with great caution: A comparison between methods.” Journal of Clinical Epidemiology, 65(10), 1078–1087. ISSN 0895-4356, doi:10.1016/j.jclinepi.2012.04.010.

Verboon P, Pat-El R (2022). “Clustering Longitudinal Data Using R: A Monte Carlo Study.” Methodology, 18(2), 144–163. doi:10.5964/meth.7143.