LCVAR {ClusterVAR}R Documentation

Fitting Latent Class VAR Models

Description

Function to fit a Latent Class VAR model with a given number of latent classes.

Usage

LCVAR(Data, yVars, Beep, Day = NULL, ID,
           xContinuous = NULL, xFactor = NULL,
           Clusters, Lags, Center = FALSE,
           smallestClN = 3, Cores = 1,
           RndSeed = NULL, Rand = 50, Rational = TRUE,
           Initialization = NULL, SigmaIncrease = 10,
           it = 50, Conv = 1e-05, pbar = TRUE, verbose = TRUE,
           Covariates = "equal-within-clusters", ...)

Arguments

Data

The data provided in a data.frame.

yVars

An integer vector specifying the position of the column(s) in dataframe Data that contain the endogenous variables (= the VAR time series).

Beep

An integer specifying the position of the column in dataframe Data that contains the time-point.

Day

Optional argument. An integer specifying the position of the column in dataframe Data that contains the variable that indicates the day of measurement. If Day is supplied here, measurements on the previous day are not used to predict measurements on the current day. Instead, the first Lags observations within each day are excluded from the calculation of VAR coefficients.

ID

An integer specifying the position of the column in dataframe Data that contains the ID variable for every participant.

xContinuous

Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the continuous exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean.

xFactor

Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the categorical exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean.

Clusters

An integer or integer vector specifying the numbers of latent classes (i.e., clusters) for which LCVAR models are to be calculated.

Lags

An integer or integer vector specifying the number of VAR(p) lags to consider. Needs to be a sequence of subsequent integers. The maximum number supported is Lags = 3.

Center

Logical, indicating whether the data (i.e., the endogenous variables) should be centered per person before calculations. If Center = TRUE, the differences in within-person means are removed from the data and the clustering is based only on similarity in VAR coefficients and (if exogenous variable(s) are specified) similarities in infleunces of exogenous variable(s). If Center = FALSE, the clustering is also based on similarities in within-person means. Defaults to Center = FALSE.

smallestClN

An integer specifying the lowest number of individuals allowed in a cluster. When during estimation the crisp cluster membership of a cluster indicates less than smallestClN individuals, the covariance matrix and the posterior probabilities of cluster membership are reset. Defaults to smallestClN = 3.

Cores

A positive integer specifying the number of cores used to parallelize the computations. Specifying a high number of available cores can speed up computation. Defaults to Cores = 1 for non-parallel computing.

RndSeed

Optional argument. An integer specifying the value supplied to set.seed(), which guarantees reproducible results. If not specified, no seed is set.

Rand

The number of pseudo-random EM-starts used in fitting each possible model. For pseudo-random starts K individuals are randomly selected as cluster centres. Then individuals are partitioned into the cluster to which their individual VAR and individual covariate coefficients are closest. High numbers (e.g., 50 and above) ensure that a global optimum will be found, but will take longer to compute. Defaults to Rand = 50.

Rational

Logical, indicating whether a rational EM-start should be used in addition to the other EM-starts. Defaults to Rational = TRUE. Rational starts are based on the k-means partitioning of individuals’ ideographic VAR and ideographic covariate coefficients.

Initialization

Optional argument. An integer specifying the position of a column in dataframe Data that contains a guess at participants' cluster membership for a fixed number of clusters, if available. This initialization will be used as an additional EM-start.

SigmaIncrease

A numerical value specifying the value by which every element of Sigma will be increased when posterior probabilities of cluster memberships are reset. Defaults to SigmaIncrease = 10.

it

An integer specifying the maximum number of EM-iterations allowed for every EM-start. After completing it EM-iterations, an EM-start is forced to terminate. High numbers (e.g., 100 and above) ensure convergence, but will take longer to compute. Defaults to it = 50.

Conv

A numerical value specifying the convergence criterion of the log likelihood to determine convergence of an EM-start. For details see Ernst et al. (2020) Inter-individual differences in multivariate time series: Latent class vector-autoregressive modelling. Defaults to Conv = 1e-05.

pbar

If pbar = TRUE, a progress bar is shown. Defaults to pbar = TRUE.

verbose

If verbose = FALSE, output messages are limited. Additionally, the pbar argument is overridden, so the progress bar is not printed. Defaults to verbose = TRUE.

Covariates

Constraints on the parameters of the exogenous variable(s). So far only Covariates = "equal-within-clusters" can be specified.

...

Additional arguments passed to the function.

Details

This function estimates the latent class vector-autoregressive model to obtain latent classes (i.e., clusters) of individuals who are similar in VAR coefficients and (if specified) in within-person means and infleunces of exogenous variable(s).

y_{i, t} = w_{i, t} + \mu_{k} + \beta_{k} x_{i, t}

w_{i, t} = (\sum_{a = 1}^{p} \Phi_{k, a} w_{i, t-a}) + u_{i, t}\qquad u_{i, t} \sim N(0, \Sigma_{k})

Here \mu_{k} represents an m x 1 vector that contains the cluster-wise conditional within-person mean for each y-variable in cluster k. \beta_{k} represents an m x q matrix that expresses the cluster-wise moderating influence of q exogenous variables (x_{i, t}) on the within-person means in cluster k. \Phi_{k, a} represents an m×m matrix containing the cluster-wise VAR coefficients at lag a for cluster k. See the references below for details.

Value

An object of class 'ClusterVAR' providing several LCVAR models. The details of the output components are as follows:

Call

A list of arguments from the original function call.

All_Models

All LCVAR models across all number of clusters, lag combinations, and number of EM-starts. All_Models[[a]][[b]][[c]] contains all information for the LCVAR model for the ath number of clusters that was specified in Clusters, for the bth combination of lag orders, based on the combination of lags that was specified in Lags, on the cth EM-start. To find the ideal model across all of them use summary(), to view the coefficients of a given model, use coef().

Runtime

The runtime the function took to complete.

Author(s)

Anja Ernst

References

Ernst, A. F., Albers, C. J., Jeronimus, B. F., & Timmerman, M. E. (2020). Inter-individual differences in multivariate time-series: Latent class vector-autoregressive modeling. European Journal of Psychological Assessment, 36(3), 482–491. doi: 10.1027/1015-5759/a000578

Examples




head(SyntheticData)
LCVAR_outExample1 <- LCVAR(Data = SyntheticData,
                           yVars = 1:4, ID = 5, Beep = 9, Day = 10,
                           xContinuous = 7, xFactor = 8,
                           Clusters = 1:2, Lags = 1,
                           Center = TRUE,
                           Cores = 2, # Adapt to local machine
                           RndSeed = 123, Rand = 1, it = 25)
summary(LCVAR_outExample1)
summary(object = LCVAR_outExample1, show = "GNL", Number_of_Lags = 1)
coef(LCVAR_outExample1, Model = c(1, 1))


head(ExampleData)
LCVAR_outExample2 <- LCVAR(Data = ExampleData,
                           yVars = 1:4, ID = 5, Beep = 6,
                           xContinuous = 7, xFactor = 8,
                           Clusters = 1:2, Lags = 1:2,
                           Center = FALSE,
                           Cores = 2, RndSeed = 123,
                           Rand = 1,
                           it = 25, Conv = 1e-05)
summary(LCVAR_outExample2)
summary(object = LCVAR_outExample2, show = "GNL", Number_of_Lags = 1)
summary(object = LCVAR_outExample2, show = "GNC", Number_of_Clusters = 2)
coef(LCVAR_outExample2, Model = c(1, 1))
plot(LCVAR_outExample2, show = "specific", Model = c(1, 1))




[Package ClusterVAR version 0.0.7 Index]