start_em {mult.latent.reg}    R Documentation

Starting values for parameters

Description

Computes the starting values for the parameters used by the EM algorithm in the functions mult.em_1level, mult.em_2level, mult.reg_1level and mult.reg_2level.

Arguments

data

A data set object; its dimension is denoted by m.

v

Covariate(s); their dimension is denoted by r.

K

Number of mixture components; the default is K = 2.

steps

Number of iterations. It is used with option = 2, for both the 1-level model and the 2-level model. It is also used with option = 3 and option = 4 for the 1-level model, provided var_fun is set to either 3 or 4. The default is steps = 20.

option

Four options for selecting the starting values for the parameters. The default is option = 1.

When option = 1: \pi_k = \frac{1}{K}; z_k ~ rnorm(K, mean = 0, sd = 1); \alpha = column means; \beta = a random row minus \alpha; \Gamma = coefficient estimates from separate linear models; \Sigma is a diagonal matrix whose diagonal entries are the column standard deviations divided by K.

When option = 2: a short run (steps = 5) of the EM function is performed, using option = 1 with var_fun = 1, and the resulting estimates are used as the starting values for all the parameters.

When option = 3: the starting value of \beta is the first principal component, and the starting values for the remaining parameters are the same as for option = 1.

When option = 4: the scores of the first principal component of the data are clustered with K-means; \pi_k is set to the proportions of the cluster assignments and z_k to the K-means centers. The starting values for the remaining parameters are the same as for option = 1.

var_fun

The four variance specifications. When var_fun = 1, the same diagonal variance matrix is used for all K components of the mixture; when var_fun = 2, different diagonal variance matrices are used for different components; when var_fun = 3, the same full (unrestricted) variance matrix is used for all components; when var_fun = 4, different full (unrestricted) variance matrices are used for different components. If unspecified, var_fun = 2. Note that, for application purposes, var_fun can only take the values 1 or 2 in two-level models.

p

Optional; specifies starting values for \pi_k, input as a K-dimensional vector.

z

Optional; specifies starting values for z_k, input as a K-dimensional vector.

beta

Optional; specifies starting values for \beta, input as an m-dimensional vector.

alpha

Optional; specifies starting values for \alpha, input as an m-dimensional vector.

sigma

Optional; specifies starting values for \Sigma_k (or \Sigma, when var_fun = 1 or var_fun = 3). When var_fun = 1, it is input as an m-dimensional vector; when var_fun = 2, as a list (of length K) of m-dimensional vectors; when var_fun = 3, as an m \times m matrix; when var_fun = 4, as a list (of length K) of m \times m matrices.

gamma

Optional; specifies starting values for \Gamma, the coefficients for the covariates, input as an m \times r matrix.
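The recipes described under option can be approximated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: it uses the faithful data without covariates (so \Gamma is omitted), fixes a seed for reproducibility, and assumes a centred, unscaled PCA via prcomp for options 3 and 4. All object names are hypothetical.

```r
# Sketch only: base-R approximations of the option = 1, 3 and 4 recipes.
# NOT the package's code; the PCA settings and the seed are assumptions.
data(faithful)
set.seed(1)
x <- as.matrix(faithful)
K <- 2
m <- ncol(x)

# option = 1
p     <- rep(1 / K, K)                        # \pi_k = 1/K
z     <- rnorm(K, mean = 0, sd = 1)           # z_k drawn from N(0, 1)
alpha <- colMeans(x)                          # \alpha = column means
beta  <- x[sample(nrow(x), 1), ] - alpha      # \beta = a random row minus \alpha
sigma <- apply(x, 2, sd) / K                  # diagonal of \Sigma: column sds over K

# option = 3: \beta taken from the first principal component
pca     <- prcomp(x)                          # assumption: centred, unscaled
beta_pc <- pca$rotation[, 1]

# option = 4: K-means on the scores of the first principal component
km   <- kmeans(pca$x[, 1], centers = K)
p_km <- as.numeric(table(km$cluster)) / nrow(x)   # cluster proportions
z_km <- as.numeric(km$centers)                    # K-means centers
```

The values produced by start_em itself may differ, for example in how the principal component is scaled or how the random row is drawn.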

Value

A list of starting values for the parameters in the models x_{i} = \alpha + \beta z_k + \Gamma v_i + \varepsilon_i (Zhang and Einbeck, 2024) and x_{ij} = \alpha + \beta z_k + \Gamma v_{ij} + \varepsilon_{ij} (Zhang et al., 2023), used in the four functions mult.em_1level, mult.em_2level, mult.reg_1level and mult.reg_2level.
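To make the roles of the parameters concrete, the 1-level model can be simulated in base R. This is an illustrative sketch only: every numeric value is made up, \Sigma is assumed to be a single diagonal variance matrix (as with var_fun = 1), and the noise is assumed Gaussian.

```r
# Sketch: simulate from x_i = alpha + beta * z_k + Gamma v_i + eps_i.
# All values below are hypothetical and chosen for illustration.
set.seed(42)
n <- 200; m <- 2; r <- 1; K <- 2

p     <- c(0.4, 0.6)                            # mixture proportions \pi_k
z     <- c(-1, 1)                               # mass points z_k
alpha <- c(3.5, 70)                             # intercept vector (length m)
beta  <- c(1, 12)                               # loading vector (length m)
Gamma <- matrix(c(0.5, 2), nrow = m, ncol = r)  # m x r covariate coefficients
sigma <- c(0.1, 30)                             # assumed diagonal variances

k   <- sample(K, n, replace = TRUE, prob = p)   # component label of each row
v   <- matrix(rnorm(n * r), nrow = n, ncol = r) # covariates v_i
eps <- sweep(matrix(rnorm(n * m), n, m), 2, sqrt(sigma), `*`)

# Row i of x is alpha + beta * z[k[i]] + Gamma %*% v[i, ] + eps[i, ]
x <- matrix(alpha, n, m, byrow = TRUE) + outer(z[k], beta) + v %*% t(Gamma) + eps
```

Each component of the returned list corresponds to one of the quantities p, z, alpha, beta, Gamma and sigma in this sketch.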

p

The starting value for the parameter \pi_k, which is a vector of length K.

alpha

The starting value for the parameter \alpha, which is a vector of length m.

z

The starting value for the parameter z_k, which is a vector of length K.

beta

The starting value for the parameter \beta, which is a vector of length m.

gamma

The starting value for the parameter \Gamma, which is a matrix.

sigma

The starting value for the parameter \Sigma_k. When var_fun = 1, \Sigma_k is a diagonal matrix with \Sigma_k = \Sigma, and we obtain a vector of the diagonal elements; when var_fun = 2, \Sigma_k is a diagonal matrix, and we obtain K vectors of the diagonal elements; when var_fun = 3, \Sigma_k is a full variance-covariance matrix with \Sigma_k = \Sigma, and we obtain a matrix \Sigma; when var_fun = 4, \Sigma_k is a full variance-covariance matrix, and we obtain K different matrices \Sigma_k.
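The four shapes of sigma can be illustrated with placeholder objects in base R. This is a sketch assuming m = 2 and K = 2 with unit variances; the object names are hypothetical and the values carry no meaning beyond showing the structure.

```r
# Sketch: the shape of sigma under each var_fun (placeholder values).
m <- 2; K <- 2

sigma_vf1 <- rep(1, m)                                  # var_fun = 1: m-vector
sigma_vf2 <- replicate(K, rep(1, m), simplify = FALSE)  # var_fun = 2: list of K m-vectors
sigma_vf3 <- diag(m)                                    # var_fun = 3: m x m matrix
sigma_vf4 <- replicate(K, diag(m), simplify = FALSE)    # var_fun = 4: list of K m x m matrices

# A vector of diagonal elements expands to the full diagonal matrix like this:
Sigma_full <- diag(sigma_vf1, nrow = m)
```

Objects of these shapes are what the sigma argument is documented to accept for the matching var_fun; for instance, sigma_vf3 would be passed together with var_fun = 3.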

References

Zhang, Y., Einbeck, J. and Drikvandi, R. (2023). A multilevel multivariate response model for data with latent structures. In: Proceedings of the 37th International Workshop on Statistical Modelling, pages 343-348. Link on RG: https://www.researchgate.net/publication/375641972_A_multilevel_multivariate_response_model_for_data_with_latent_structures.

Zhang, Y. and Einbeck, J. (2024). A Versatile Model for Clustered and Highly Correlated Multivariate Data. J Stat Theory Pract 18(5). doi:10.1007/s42519-023-00357-0

Examples

## Example for the faithful data.
library(mult.latent.reg)
data(faithful)
# Default starting values (option = 1); the result is a list.
start <- start_em(faithful, option = 1)

[Package mult.latent.reg version 0.1.7 Index]