R: Selecting the best results for multivariate one level model

mult.reg_1level {mult.latent.reg}

R Documentation

Selecting the best results for multivariate one level model

Description

This wrapper function runs mult.em_1level multiple times to fit Zhang and Einbeck's (2024) multivariate response model with a one-level random effect, and selects the result with the smallest AIC value.
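
The multi-start selection described above can be sketched in base R as follows. This is only an illustration of the general pattern, not the package's implementation: `toy_fit` and `select_best` are hypothetical names, and `toy_fit` is a made-up stand-in for a single EM fit that returns an AIC value.

```r
# Hypothetical stand-in for one EM fit; this is NOT mult.em_1level.
# It returns the starting value it was given and a made-up AIC.
toy_fit <- function(start) {
  list(start = start, AIC = (start - 2)^2 + 10)
}

# Run the fit once per starting value and keep the fit with the smallest AIC.
select_best <- function(starts, fit_fun) {
  fits <- lapply(starts, fit_fun)
  aic  <- vapply(fits, function(f) f$AIC, numeric(1))
  best <- which.min(aic)
  list(best_fit = fits[[best]], best_run = best, aic_data = aic)
}

starts <- c(0.5, 1.8, 4.0)
res <- select_best(starts, toy_fit)
res$best_run   # index of the run with the smallest AIC
res$aic_data   # AIC value of every run
```

Keeping the AIC of every run and the starting values alongside the best fit is what makes it possible to rerun the winning fit on its own later.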

Arguments

`data`	A data set object; `m` denotes the dimension of the data set.
`v`	Covariate(s).
`K`	Number of mixture components. The default is `K = 2`.
`steps`	Number of iterations within each run. The default is `steps = 20`.
`num_runs`	Number of times the fitting function is run. The default is `num_runs = 10`.
`start`	A list of starting values for the parameters of the proposed model (`p`, `alpha`, `z`, `beta`, `sigma`, `gamma`). Starting values can be obtained with start_em; see start_em for details.
`option`	One of four options for selecting the starting values of the model parameters. The default is `option = 1`. See start_em for details.
`var_fun`	Variance specification, one of four types: `var_fun = 1`, the same diagonal variance matrix for all `K` components of the mixture; `var_fun = 2`, different diagonal variance matrices for different components; `var_fun = 3`, the same full (unrestricted) variance matrix for all components; `var_fun = 4`, different full (unrestricted) variance matrices for different components. The default is `var_fun = 2`.
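
To give a feel for how the four `var_fun` choices differ in complexity, the sketch below counts the free variance parameters each specification implies for `m` response variables and `K` components. The helper `n_var_params` is hypothetical and not part of the package; it uses the standard counts (`m` for a diagonal matrix, `m(m+1)/2` for a full symmetric matrix), and the package's own `number_parameters` also includes the other model parameters.

```r
# Hypothetical helper: number of free variance parameters per specification.
n_var_params <- function(m, K, var_fun) {
  full <- m * (m + 1) / 2   # free entries of a symmetric m x m matrix
  switch(as.character(var_fun),
         "1" = m,           # one shared diagonal matrix
         "2" = K * m,       # one diagonal matrix per component
         "3" = full,        # one shared full matrix
         "4" = K * full,    # one full matrix per component
         stop("var_fun must be 1, 2, 3 or 4"))
}

# Counts for m = 5 responses and K = 3 components.
counts <- sapply(1:4, function(vf) n_var_params(m = 5, K = 3, var_fun = vf))
counts   # 5 15 15 45
```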

Value

The best estimated result (the one with the smallest AIC value) for the model (Zhang and Einbeck, 2024) `x_i = \alpha + \beta z_k + \Gamma v_i + \varepsilon_i`, obtained through the EM algorithm.

`p`	The estimates for the parameter `\pi_k`, which is a vector of length `K`.
`alpha`	The estimates for the parameter `\alpha`, which is a vector of length `m`.
`z`	The estimates for the parameter `z_k`, which is a vector of length `K`.
`beta`	The estimates for the parameter `\beta`, which is a vector of length `m`.
`gamma`	The estimates for the parameter `\Gamma`, which is a matrix.
`sigma`	The estimates for the parameter `\Sigma_k`. When `var_fun = 1`, `\Sigma_k` is a diagonal matrix with `\Sigma_k = \Sigma`, and we obtain a vector of the diagonal elements; when `var_fun = 2`, `\Sigma_k` is a diagonal matrix, and we obtain `K` vectors of the diagonal elements; when `var_fun = 3`, `\Sigma_k` is a full variance-covariance matrix with `\Sigma_k = \Sigma`, and we obtain a matrix `\Sigma`; when `var_fun = 4`, `\Sigma_k` is a full variance-covariance matrix, and we obtain `K` different matrices `\Sigma_k`.
`W`	The posterior probability matrix.
`loglikelihood`	The approximated log-likelihood of the fitted model.
`disparity`	The disparity (`-2logL`) of the fitted model.
`number_parameters`	The number of parameters estimated in the EM algorithm.
`AIC`	The AIC value (`-2logL + 2*number_parameters`).
`BIC`	The BIC value (`-2logL + number_parameters*log(n)`), where n is the number of observations.
`aic_data`	The AIC values of all runs.
`Starting_values`	A list of the starting values used in each of the `num_runs` runs. It allows the best result obtained from mult.reg_1level to be reproduced in a single run of mult.em_1level, by setting `start` to the starting values that produced the best result.
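
The relationship between `loglikelihood`, `disparity`, `AIC` and `BIC` listed above can be checked by hand. The sketch below is a minimal illustration with made-up numbers; `info_criteria` is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper implementing the formulas stated above.
info_criteria <- function(logL, number_parameters, n) {
  disparity <- -2 * logL
  c(disparity = disparity,
    AIC = disparity + 2 * number_parameters,
    BIC = disparity + number_parameters * log(n))
}

# Made-up values: log-likelihood -100, 10 parameters, 50 observations.
ic <- info_criteria(logL = -100, number_parameters = 10, n = 50)
ic
```

Because `log(n)` exceeds 2 once `n` is at least 8, BIC penalises additional parameters more heavily than AIC for all but very small samples.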

References

Zhang, Y. and Einbeck, J. (2024). A Versatile Model for Clustered and Highly Correlated Multivariate Data. J Stat Theory Pract 18(5). doi:10.1007/s42519-023-00357-0

Examples


## Run mult.em_1level() multiple times and select the result with the smallest AIC value.
set.seed(7)
results <- mult.reg_1level(fetal_covid_data[, 1:5], v = fetal_covid_data$status_bi,
                           K = 3, num_runs = 5, steps = 20, var_fun = 2, option = 1)
## Reproduce the best result; in this example the best result comes from the 5th run.
rep_best_result <- mult.em_1level(fetal_covid_data[, 1:5],
                                  v = fetal_covid_data$status_bi,
                                  K = 3, steps = 20, var_fun = 2, option = 1,
                                  start = results$Starting_values[[5]])
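
For readers without access to the example data, the following base R sketch simulates data of the form x_i = \alpha + \beta z_k + \Gamma v_i + \varepsilon_i. All parameter values are made up, and several details are assumptions rather than facts from this page: a single binary covariate, `\Gamma` stored as an `m` x 1 matrix, and a shared diagonal variance (as in `var_fun = 1`).

```r
set.seed(1)
n <- 200; m <- 3; K <- 2

p     <- c(0.4, 0.6)                          # mixture proportions pi_k
z     <- c(-1, 1)                             # mass points z_k
alpha <- c(0, 1, 2)                           # intercepts, length m
beta  <- c(1, 0.5, -0.5)                      # loadings on z_k, length m
Gamma <- matrix(c(0.3, -0.2, 0.1), nrow = m)  # assumed m x 1 (one covariate)
sigma <- c(0.5, 0.5, 0.5)                     # diagonal variances (shared)

comp <- sample(seq_len(K), n, replace = TRUE, prob = p)  # component labels
v    <- rbinom(n, 1, 0.5)                                 # binary covariate

# Column j of eps has variance sigma[j]; the matrix is filled column by column.
eps <- matrix(rnorm(n * m, sd = sqrt(rep(sigma, each = n))), nrow = n)

# Row i of x is alpha + beta * z[comp[i]] + Gamma * v[i] + eps[i, ].
x <- matrix(alpha, n, m, byrow = TRUE) +
  outer(z[comp], beta) +
  outer(v, drop(Gamma)) +
  eps

dim(x)   # n rows, m columns
```

A data set built this way has known component labels (`comp`), which makes it convenient for checking whether a fitted model recovers the clustering.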