CluMP {CluMP}    R Documentation

Cluster Micro-Panel (longitudinal) Data employing the CluMP algorithm


This function clusters micro-panel (longitudinal) data, i.e. trajectories, into a pre-defined number of clusters using CluMP, a feature-based clustering algorithm for micro-panel (longitudinal) data (see References). Currently, only univariate clustering is available.
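
The general idea of feature-based trajectory clustering can be sketched in base R as follows. This is only an illustration under stated assumptions: the three summary features (mean, least-squares slope, mean absolute change between visits) are hypothetical stand-ins chosen for brevity and are not the parameters that CluMP computes, and the column names ID, Time and Y are likewise assumed.

```r
# Toy long-format panel: 6 individuals, 5 visits each (assumed column names).
set.seed(1)
panel <- data.frame(
  ID   = rep(1:6, each = 5),
  Time = rep(0:4, times = 6)
)
# Individuals 1-3 rise steeply, individuals 4-6 decline slowly.
true_slope <- ifelse(panel$ID <= 3, 2, -0.5)
panel$Y <- 10 + true_slope * panel$Time + rnorm(nrow(panel), sd = 0.1)

# Illustrative per-trajectory features (NOT CluMP's actual parameter set).
traj_features <- function(d) {
  d <- d[order(d$Time), ]
  c(mean_y    = mean(d$Y),
    slope     = unname(coef(lm(Y ~ Time, data = d))[2]),
    mean_step = mean(abs(diff(d$Y))))
}
# One row of features per individual.
feats <- t(sapply(split(panel, panel$ID), traj_features))

# Hierarchical clustering on standardised features, cut into 2 clusters.
tree     <- hclust(dist(scale(feats)), method = "ward.D")
clusters <- cutree(tree, k = 2)
print(clusters)
```

Each trajectory is reduced to a fixed-length feature vector, so individuals with different numbers or timings of visits can still be compared in the same feature space.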


CluMP(formula, group, data, cl_numb = NA, base_val = FALSE, method = "ward.D")



A two-sided formula object with a numeric clustering variable (Y) on the left of a ~ separator and the time (numeric) variable on the right. Time is measured from the start of the follow-up period (baseline). Any time units are possible.


A grouping factor variable (vector), i.e. a single identifier for each individual (trajectory).


A data frame containing the variables named in the formula and group arguments.


A positive integer (scalar) specifying the number of clusters. The OptiNum function can be used to determine the optimal number of clusters according to common evaluation criteria (indices).


Logical; indicates whether to include the value at the zero time point (baseline) as an additional clustering variable. The default is FALSE, in which case the standard number (7) of clustering parameters is used.


The agglomeration method used in hierarchical clustering, as in the hclust function: one of "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid". The default is "ward.D".
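
The arguments above imply a long-format layout: one row per individual and visit, with an identifier column, a numeric time column measured from baseline, and the numeric clustering variable. A minimal base-R sketch of preparing such a data frame is given here, assuming hypothetical raw columns ID, VisitDay and Y in which visit days are not yet measured from each individual's first visit.

```r
# Hypothetical raw measurements: VisitDay is an arbitrary day offset,
# not time since the individual's first visit.
raw <- data.frame(
  ID       = c("a", "a", "a", "b", "b", "b"),
  VisitDay = c(10, 17, 31, 3, 12, 20),
  Y        = c(5.1, 5.6, 6.4, 7.0, 6.2, 5.9)
)

# Re-express time as days since each individual's first visit (baseline = 0).
raw$Time <- raw$VisitDay - ave(raw$VisitDay, raw$ID, FUN = min)

# Treat the identifier as a grouping factor and sort within individual.
raw$ID <- factor(raw$ID)
panel  <- raw[order(raw$ID, raw$Time), c("ID", "Time", "Y")]

# Value observed at the zero time point for each individual, i.e. the kind
# of quantity that base_val = TRUE adds as a clustering variable.
baseline <- panel$Y[panel$Time == 0]
names(baseline) <- as.character(panel$ID[panel$Time == 0])

print(panel)
print(baseline)
```

A data frame shaped like panel could then be supplied as data, with formula = Y ~ Time and the identifier column given as group.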


Clusters micro-panel data. The output is a list of 5 components containing the clustering results.


Sobisek, L., Stachova, M., Fojtik, J. (2018) Novel Feature-Based Clustering of Micro-Panel Data (CluMP). Working paper version online:


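library(CluMP)

# Generate an example panel with the package's GeneratePanel function
data <- GeneratePanel(n = 100, Param = ParamLinear, NbVisit = 10)

# Cluster into 3 groups using the standard clustering parameters
CluMP(formula = Y ~ Time, group = "ID", data = data, cl_numb = 3,
      base_val = FALSE, method = "ward.D")

# Same call, adding the value at the zero time point as a clustering variable
CluMP(formula = Y ~ Time, group = "ID", data = data, cl_numb = 3,
      base_val = TRUE, method = "ward.D")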

[Package CluMP version 0.8.1 Index]