start_em {mult.latent.reg}    R Documentation

Starting values for parameters


The starting values for parameters used for the EM algorithm in the functions: mult.em_1level, mult.em_2level, mult.reg_1level and mult.reg_2level.



A data set object; we denote the dimension of the data set by m.


Covariate(s); we denote their dimension by r.


Number of mixture components; the default is K = 2.


Number of iterations. This is used with option = 2, for both the 1-level model and the 2-level model. It is also used with option = 3 and option = 4 for the 1-level model, provided var_fun is set to either 3 or 4. The default is steps = 20.


Four options for selecting the starting values for the parameters. The default is option = 1. When option = 1: π_k = 1/K, z_k ~ rnorm(K, mean = 0, sd = 1), α = column means, β = a random row minus α, Γ = coefficient estimates from separate linear models, and Σ is a diagonal matrix whose diagonal entries are the column standard deviations divided by K. When option = 2: a short run (steps = 5) of the EM function, which uses option = 1 with var_fun = 1, is carried out, and its estimates are used as the starting values for all the parameters. When option = 3: the starting value of β is the first principal component, and the starting values for the remaining parameters are the same as for option = 1. When option = 4: K-means clustering is applied to the scores of the first principal component of the data; π_k is set to the proportions of the cluster assignments, z_k takes the values of the K-means centers, and the starting values for the remaining parameters are the same as for option = 1.
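The options described above can be sketched in base R for a data set without covariates. This is only an illustration of the description, not the package's implementation: the list element names (p, z, alpha, beta, sigma) are hypothetical, reading "first principal component" as the loading vector is an assumption, and prcomp() is called with its defaults (centred, unscaled), which may differ from what start_em does.

```r
# Illustrative sketch only; names and preprocessing choices are assumptions.
set.seed(1)
x <- as.matrix(faithful)  # n x m data matrix (here m = 2)
K <- 2
m <- ncol(x)

# option = 1: simple data-driven starting values
start1 <- list(
  p     = rep(1 / K, K),               # pi_k = 1/K
  z     = rnorm(K, mean = 0, sd = 1),  # z_k drawn from N(0, 1)
  alpha = colMeans(x),                 # column means
  sigma = apply(x, 2, sd) / K          # diagonal entries of Sigma
)
# beta = a randomly chosen row minus alpha
start1$beta <- x[sample(nrow(x), 1), ] - start1$alpha

# option = 3: beta taken from the first principal component
# (assumed here to mean the first loading vector)
pc <- prcomp(x)
start3 <- start1
start3$beta <- pc$rotation[, 1]

# option = 4: K-means on the scores of the first principal component
km <- kmeans(pc$x[, 1], centers = K)
start4 <- start1
start4$p <- tabulate(km$cluster, nbins = K) / nrow(x)  # cluster proportions
start4$z <- as.numeric(km$centers)                     # K-means centers
```

Under option = 4 the mixture weights and mass points come from the clustering, while the other entries are carried over unchanged from option = 1.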


The four variance specifications. When var_fun = 1, the same diagonal variance matrix is used for all K components of the mixture; when var_fun = 2, different diagonal variance matrices are used for different components; when var_fun = 3, the same full (unrestricted) variance matrix is used for all components; when var_fun = 4, different full (unrestricted) variance matrices are used for different components. If unspecified, var_fun = 2. Note that, for application purposes, in two-level models var_fun can only take the values 1 or 2.


optional; specifies starting values for π_k, input as a K-dimensional vector.


optional; specifies starting values for z_k, input as a K-dimensional vector.


optional; specifies starting values for β, input as an m-dimensional vector.


optional; specifies starting values for α, input as an m-dimensional vector.


optional; specifies starting values for Σ_k (Σ, when var_fun = 1 or var_fun = 3). When var_fun = 1, it is input as an m-dimensional vector; when var_fun = 2, as a list (of length K) of m-dimensional vectors; when var_fun = 3, as an m × m matrix; when var_fun = 4, as a list (of length K) of m × m matrices.


optional; the coefficients for the covariates; specifies starting values for Γ, input as an m × r matrix.
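As a rough illustration of the four input shapes just described, the sketch below builds a placeholder of each shape for the faithful data. The numeric values (column standard deviations over K, and the sample covariance matrix) are arbitrary placeholders chosen only so that the shapes are visible, and the object names are hypothetical.

```r
# Illustrative sketch: shapes of the variance starting values per var_fun.
x <- as.matrix(faithful)
K <- 2
m <- ncol(x)

d <- apply(x, 2, sd) / K  # placeholder m-dimensional vector of diagonal entries
S <- cov(x)               # placeholder m x m matrix

sigma_by_var_fun <- list(
  "1" = d,                # var_fun = 1: one m-dimensional vector
  "2" = rep(list(d), K),  # var_fun = 2: list of K m-dimensional vectors
  "3" = S,                # var_fun = 3: one m x m matrix
  "4" = rep(list(S), K)   # var_fun = 4: list of K m x m matrices
)

str(sigma_by_var_fun)
```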


The starting values (in a list) for parameters in the models x_i = α + β z_k + Γ v_i + ε_i (Zhang and Einbeck, 2024) and x_ij = α + β z_k + Γ v_ij + ε_ij (Zhang et al., 2023) used in the four functions: mult.em_1level, mult.em_2level, mult.reg_1level and mult.reg_2level.


The starting value for the parameter π_k, which is a vector of length K.


The starting value for the parameter α, which is a vector of length m.


The starting value for the parameter z_k, which is a vector of length K.


The starting value for the parameter β, which is a vector of length m.


The starting value for the parameter Γ, which is a matrix.


The starting value for the parameter Σ_k. When var_fun = 1, Σ_k is a diagonal matrix with Σ_k = Σ, and we obtain a vector of the diagonal elements; when var_fun = 2, Σ_k is a diagonal matrix, and we obtain K vectors of the diagonal elements; when var_fun = 3, Σ_k is a full variance-covariance matrix with Σ_k = Σ, and we obtain a matrix Σ; when var_fun = 4, Σ_k is a full variance-covariance matrix, and we obtain K different matrices Σ_k.
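To make the role of these quantities concrete, the following sketch takes a hand-written list of starting values for the 1-level model without covariates, computes the implied component means α + β z_k, and simulates data from the resulting mixture. The element names (p, alpha, z, beta, sigma) and all numeric values are hypothetical, and sigma is treated as the diagonal of a common variance matrix (the var_fun = 1 case), which is an assumption made for the illustration.

```r
# Illustrative sketch; element names and values are hypothetical.
start <- list(
  p     = c(0.5, 0.5),  # mixture weights pi_k
  alpha = c(3.5, 71),   # intercept vector (length m)
  z     = c(-1, 1),     # mass points z_k (length K)
  beta  = c(1, 13),     # direction vector (length m)
  sigma = c(0.3, 35)    # assumed: diagonal variances shared by all components
)

# Component means: m x K matrix whose k-th column is alpha + beta * z_k
mu <- start$alpha + outer(start$beta, start$z)

# Simulate n observations x_i = alpha + beta * z_k + eps_i
set.seed(1)
n <- 200
m <- length(start$alpha)
k <- sample(seq_along(start$p), n, replace = TRUE, prob = start$p)
eps <- matrix(rnorm(n * m, sd = rep(sqrt(start$sigma), each = n)), n, m)
x <- t(mu[, k]) + eps
```

Because β is a single vector, all component means lie on one line through α, which is the latent structure the model is built around.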


Zhang, Y., Einbeck, J. and Drikvandi, R. (2023). A multilevel multivariate response model for data with latent structures. In: Proceedings of the 37th International Workshop on Statistical Modelling, pages 343-348.

Zhang, Y. and Einbeck, J. (2024). A Versatile Model for Clustered and Highly Correlated Multivariate Data. J Stat Theory Pract 18(5). doi:10.1007/s42519-023-00357-0


## Example for the faithful data.
library(mult.latent.reg)
start <- start_em(faithful, option = 1)

[Package mult.latent.reg version 0.1.7 Index]