Step1Measures {traj} R Documentation

Compute Measures for Identifying Patterns of Change in Longitudinal Data

Description

Step1Measures computes up to 18 measures for each longitudinal trajectory. See Details for the list of measures.

Usage

Step1Measures(
  Data,
  Time = NULL,
  ID = FALSE,
  measures = c(1:17),
  midpoint = NULL,
  cap.outliers = TRUE
)

## S3 method for class 'trajMeasures'
print(x, ...)

## S3 method for class 'trajMeasures'
summary(object, ...)

Arguments

Data

a matrix or data frame in which each row contains the longitudinal data (trajectories).

Time

either NULL, a vector, or a matrix/data frame of the same dimension as Data. If a vector, matrix or data frame is supplied, its entries are taken to be the times at which the corresponding cells of Data were measured. When set to NULL (the default), the observation times are assumed to be equidistant.
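As a hedged illustration of the shapes accepted for Time, the sketch below builds a small Data matrix together with a time vector and a time matrix. The object names (dat, time_vec, time_mat) and the values are hypothetical, the vector is presumed to apply to every trajectory, and the call to Step1Measures is left commented out because it requires the traj package.

```r
# Three hypothetical trajectories observed on four occasions (one row each).
dat <- matrix(c(1.0, 2.1, 2.9, 4.2,
                5.0, 4.1, 3.2, 1.9,
                2.0, 2.0, 2.1, 1.9),
              nrow = 3, byrow = TRUE)

# Option 1: a single vector of observation times.
time_vec <- c(0, 1, 3, 6)

# Option 2: a matrix of the same dimension as dat,
# holding one row of observation times per trajectory.
time_mat <- rbind(c(0, 1, 3, 6),
                  c(0, 2, 4, 6),
                  c(0, 1, 2, 3))

# Check the shapes before handing them to the function.
vec_ok <- length(time_vec) == ncol(dat)
mat_ok <- identical(dim(time_mat), dim(dat))

# library(traj)
# m <- Step1Measures(Data = dat, Time = time_mat, ID = FALSE)
```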

ID

logical. Set to TRUE if the first column of Data and of Time corresponds to an ID variable identifying the trajectories. Defaults to FALSE.

measures

a vector containing the numerical identifiers of the measures to compute (see "Details" section below). The default, 1:17, selects measures 1-17 and thus excludes measure 18, which requires a midpoint.

midpoint

specifies which column of Time to use as the midpoint in measure 18. Can be NULL, an integer, or a vector of integers whose length equals the number of rows in Time. The default is NULL, in which case the midpoint is the time closest to the median of the Time vector specific to each trajectory.
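The default midpoint rule described above can be sketched in base R as follows. The helper closest_to_median is hypothetical and is not part of traj; in particular, how the package breaks ties between two equally close times is an assumption here (the first one wins).

```r
# Hypothetical helper: index of the observation time closest to the median time.
# which.min() returns the first index when several times are equally close.
closest_to_median <- function(times) {
  which.min(abs(times - median(times, na.rm = TRUE)))
}

times <- c(0, 1, 3, 6, 10)
mid <- closest_to_median(times)  # the median time is 3, the third entry

# With an even number of times the median falls between two observations;
# this sketch then picks the earlier of the two.
mid_even <- closest_to_median(c(0, 1, 2, 10))
```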

cap.outliers

logical. If TRUE, extreme values of the measures will be capped (see "Details" section below). Defaults to TRUE.

x

object of class trajMeasures.

...

further arguments passed to or from other methods.

object

object of class trajMeasures.

Details

Each trajectory must have at least 3 observations; otherwise, it is omitted from the analysis.
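To see ahead of time which trajectories would fall below this minimum, one could count the observations per row. The sketch below assumes that missing observations are coded as NA, which is an assumption and not something stated on this page; the object names are hypothetical.

```r
# Hypothetical data: one trajectory per row, NA marks a missing observation.
dat <- rbind(c(1, 2, 3, 4),
             c(5, NA, NA, 2),
             c(2, 2, NA, 1))

# Number of non-missing observations in each trajectory.
n_obs <- rowSums(!is.na(dat))

# Trajectories with at least 3 observations would be retained.
keep <- n_obs >= 3
dat_kept <- dat[keep, , drop = FALSE]
```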

The 18 measures and their numerical identifiers are listed below. Please refer to the vignette for the specific formulas used to compute them.

  1. Maximum

  2. Range (max - min)

  3. Mean value

  4. Standard deviation

  5. Slope of the linear model

  6. R^2: Proportion of variance explained by the linear model

  7. Curve length (total variation)

  8. Rate of intersection with the mean

  9. Proportion of time spent under the mean

  10. Minimum of the first derivative

  11. Maximum of the first derivative

  12. Mean of the first derivative

  13. Standard deviation of the first derivative

  14. Minimum of the second derivative

  15. Maximum of the second derivative

  16. Mean of the second derivative

  17. Standard deviation of the second derivative

  18. Early change/Later change
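For orientation, the first six measures can be approximated for a single trajectory with base R alone. This is a minimal sketch using conventional definitions (for example, sd() with the n - 1 denominator); the vignette gives the exact formulas used by the package, which may differ. The vectors y and obs_time are hypothetical.

```r
# One hypothetical trajectory and its observation times.
y <- c(1, 3, 2, 5, 4)
obs_time <- c(0, 1, 2, 3, 4)

# Ordinary least-squares line through the trajectory.
fit <- lm(y ~ obs_time)

m <- c(
  maximum   = max(y),                          # measure 1
  range     = max(y) - min(y),                 # measure 2
  mean      = mean(y),                         # measure 3
  sd        = sd(y),                           # measure 4
  slope     = unname(coef(fit)["obs_time"]),   # measure 5
  r_squared = summary(fit)$r.squared           # measure 6
)
```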

In the presence of highly correlated measures (Pearson correlation > 0.98), the function selects the highest-ranking measure on the list above and discards the others. Because the K-means algorithm is sensitive to outliers, the measures are prevented from taking extreme or infinite values (the latter can arise from a division by 0 in m18). Nishiyama's improved Chebyshev bound is used to determine, for each measure, the threshold corresponding to a 0.3% probability; values beyond this threshold are capped at the threshold. If applicable, values of m18 that would be of the form 0/0 are set to 1. PCA is applied to the remaining measures using the principal function from the psych package.
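The capping idea can be illustrated with the classical Chebyshev inequality, P(|X - mu| >= k * sigma) <= 1 / k^2, which gives k = 1 / sqrt(0.003) for a 0.3% threshold. Note that this is a stand-in: the package uses Nishiyama's improved bound, which yields a different threshold, and how it estimates location and scale is not described here. The helper cap_values and the data are hypothetical.

```r
# Hypothetical helper: cap values lying outside mean +/- k * sd, where k comes
# from the CLASSICAL Chebyshev bound (not Nishiyama's improved bound).
cap_values <- function(x, prob = 0.003) {
  k <- 1 / sqrt(prob)
  lower <- mean(x) - k * sd(x)
  upper <- mean(x) + k * sd(x)
  pmin(pmax(x, lower), upper)
}

# 499 ordinary values and one extreme value.
x <- c(rep(0, 499), 1000)
capped <- cap_values(x)

# A ratio of the form 0/0 evaluates to NaN in R; set such values to 1.
early <- c(2, 0)
later <- c(1, 0)
ratio <- early / later
ratio[is.nan(ratio)] <- 1
```

Because the classical bound is very conservative, only a value far from the bulk of a fairly large sample is capped in this sketch.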

Value

An object of class trajMeasures; a list containing the values of the measures, a table of the outliers which have been capped, as well as a curated form of the function's arguments.

References

Leffondre K, Abrahamowicz M, Regeasse A, Hawker GA, Badley EM, McCusker J, Belzile E. Statistical measures were proposed for identifying longitudinal patterns of change in quantitative health indicators. J Clin Epidemiol. 2004 Oct;57(10):1049-62. doi: 10.1016/j.jclinepi.2004.02.012. PMID: 15528056.

Nishiyama T, Improved Chebyshev inequality: new probability bounds with known supremum of PDF, arXiv:1808.10770v2 stat.ME https://doi.org/10.48550/arXiv.1808.10770

Examples

## Not run: 
data("trajdata")
trajdata.noGrp <- trajdata[, -which(colnames(trajdata) == "Group")] # remove the Group column

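# Measure 18 with the default midpoint (NULL) and with midpoint = 3
m1 <- Step1Measures(trajdata.noGrp, ID = TRUE, measures = 18, midpoint = NULL)
m2 <- Step1Measures(trajdata.noGrp, ID = TRUE, measures = 18, midpoint = 3)

# Check whether both calls produce the same measures
identical(m1$measures, m2$measures)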

## End(Not run)


[Package traj version 2.1.0 Index]