mult.em_2level {mult.latent.reg}    R Documentation

EM algorithm for multivariate two level model with covariates

Description

This function extends the one-level version, mult.em_1level, to nested (structured) multivariate data. It obtains maximum likelihood estimates (MLEs) using the EM algorithm, e.g. for multivariate test scores (such as numeracy and literacy) of students nested in different classes or schools. The resulting estimates can be used for clustering or for constructing league tables (rankings of observations). When covariates are included, the model becomes a multivariate response model that can be used for further regression analysis. Detailed information about the model used in this function can be found in Zhang et al. (2023).
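As a rough illustration of the kind of computation an EM algorithm performs for a model of this type, the sketch below implements one E-step in base R. It is not the package's internal code. It assumes normally distributed errors with diagonal covariance, a coefficient matrix Gamma of dimension m x q, and that all lower-level units within one upper-level unit share the same mixture component; the function name e_step and all argument names are hypothetical.

```r
# Hypothetical E-step sketch; NOT the internal code of mult.em_2level().
# x: N x m responses, v: N x q covariates, id: upper-level unit of each row,
# sigma: list of K vectors holding the diagonal variances (assumed layout).
e_step <- function(x, v, id, p, alpha, beta, z, Gamma, sigma) {
  ids <- unique(id)
  K <- length(p)
  N <- nrow(x)
  logw <- matrix(0, nrow = length(ids), ncol = K)
  for (k in seq_len(K)) {
    # Mean of x_ij under component k: alpha + beta * z_k + Gamma v_ij
    mu <- sweep(v %*% t(Gamma), 2, alpha + beta * z[k], "+")
    # Diagonal covariance: sum univariate normal log-densities over the m responses
    sd_k <- rep(sqrt(sigma[[k]]), each = N)
    ll <- rowSums(dnorm(x, mean = mu, sd = sd_k, log = TRUE))
    # Add up the log-densities of all lower-level units within each upper-level unit
    logw[, k] <- log(p[k]) + as.vector(tapply(ll, factor(id, levels = ids), sum))
  }
  # Normalise on the log scale for numerical stability
  logw <- logw - apply(logw, 1, max)
  W <- exp(logw)
  W / rowSums(W)
}

# Tiny made-up data set: 3 upper-level units with 2 lower-level units each
set.seed(2)
id <- rep(1:3, each = 2)
x <- matrix(rnorm(12), 6, 2)
v <- matrix(rnorm(6), 6, 1)
W <- e_step(x, v, id, p = c(0.5, 0.5), alpha = c(0, 0), beta = c(1, 1),
            z = c(-1, 1), Gamma = matrix(c(0.2, -0.1), 2, 1),
            sigma = list(c(1, 1), c(1, 1)))
rowSums(W)  # each row of posterior probabilities sums to 1
```

In a full EM algorithm, an M-step would then update the parameters using these posterior probabilities as weights, and the two steps would be repeated for the requested number of iterations.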

Arguments

data

A data set object containing the response variables; its dimension (the number of response variables) is denoted by m.

v

Covariate(s).

K

Number of mixture components, the default is K = 2. Note that when K = 1, z and beta will be 0.

steps

Number of iterations, the default is steps = 20.

start

A list of starting values for the parameters of the proposed model (p, alpha, z, beta, sigma, gamma). The starting values can be obtained using start_em; see start_em for more details.

option

Selects one of four options for choosing the starting values of the model parameters. The default is option = 1. More details can be found in start_em.

var_fun

The type of variance specification. With var_fun = 1, the same diagonal variance matrix is used for all K components of the mixture; with var_fun = 2, each component has its own diagonal variance matrix. The default is var_fun = 2.
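The two variance specifications selected by var_fun can be pictured with a small base R sketch. The object names and numbers below are made up for illustration, and the list layout is an assumption rather than the package's internal representation.

```r
m <- 3  # number of response variables (illustrative)
K <- 2  # number of mixture components (illustrative)

# var_fun = 1: one vector of m diagonal elements, shared by all K components
sigma_common <- c(1.0, 0.5, 2.0)
Sigma_common <- replicate(K, diag(sigma_common), simplify = FALSE)

# var_fun = 2: one vector of m diagonal elements per component
sigma_by_comp <- list(c(1.0, 0.5, 2.0), c(0.8, 0.7, 1.5))
Sigma_by_comp <- lapply(sigma_by_comp, diag)

# Number of free variance parameters under each specification
length(sigma_common)           # m
length(unlist(sigma_by_comp))  # K * m
```

The shared specification is more parsimonious, while the component-specific one lets the spread of the responses differ between mixture components.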

Value

The estimated parameters in the model x_{ij} = \alpha + \beta z_k + \Gamma v_{ij} + \varepsilon_{ij} obtained through the EM algorithm, where i indexes the upper-level units, j indexes the lower-level units, and k indexes the mixture components.
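To make the structure of this model concrete, the following base R sketch simulates data from it. All parameter values are invented for illustration. The sketch assumes normal errors, a coefficient matrix Gamma of dimension m x q (q being the number of covariates), and that each upper-level unit belongs to a single component drawn with probabilities \pi_k; none of these names come from the package.

```r
set.seed(1)
m <- 2; q <- 3; K <- 2       # responses, covariates, components (illustrative)
n_upper <- 50; n_lower <- 4  # upper-level units and lower-level units per unit

p     <- c(0.4, 0.6)                       # mixing proportions pi_k
alpha <- c(1, -1)                          # intercepts, length m
beta  <- c(2, 0.5)                         # loadings on the mass points, length m
z     <- c(-1.5, 1)                        # mass points z_k, length K
Gamma <- matrix(c(0.3, 0, -0.2, 0.1, 0.4, 0), nrow = m)  # assumed m x q
sigma <- list(c(0.5, 0.5), c(1, 0.8))      # diagonal variances per component

# Each upper-level unit i is assigned one component k
comp <- sample(K, n_upper, replace = TRUE, prob = p)
id   <- rep(seq_len(n_upper), each = n_lower)
N    <- length(id)
k_ij <- comp[id]                           # component of each lower-level unit

v   <- matrix(rnorm(N * q), N, q)          # covariates v_ij
eps <- t(sapply(k_ij, function(k) rnorm(m, sd = sqrt(sigma[[k]]))))

# x_ij = alpha + beta * z_k + Gamma v_ij + eps_ij, one row per lower-level unit
x <- matrix(alpha, N, m, byrow = TRUE) + outer(z[k_ij], beta) + v %*% t(Gamma) + eps
dim(x)
```

Here x plays the role of the data argument and v the role of the covariates.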

p

The estimates for the parameter \pi_k, which is a vector of length K.

alpha

The estimates for the parameter \alpha, which is a vector of length m.

z

The estimates for the parameter z_k, which is a vector of length K.

beta

The estimates for the parameter \beta, which is a vector of length m.

gamma

The estimates for the parameter \Gamma, which is a matrix.

sigma

The estimates for the parameter \Sigma_k. When var_fun = 1, \Sigma_k is a diagonal matrix with \Sigma_k = \Sigma for all k, and a single vector of its diagonal elements is returned. When var_fun = 2, each \Sigma_k is a diagonal matrix, and K vectors of diagonal elements are returned.

W

The posterior probability matrix.

loglikelihood

The approximated log-likelihood of the fitted model.

disparity

The disparity (-2logL) of the fitted model.

number_parameters

The number of parameters estimated in the EM algorithm.

AIC

The AIC value (-2logL + 2 * number_parameters).

starting_values

A list of starting values for parameters used in the EM algorithm.
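The sketch below shows how some of the returned components relate to each other. It uses a toy list with made-up numbers as a stand-in for the return value, and it assumes that each row of W corresponds to one upper-level unit and each column to one mixture component.

```r
# Toy stand-in for the list returned by mult.em_2level(); all values are made up.
res <- list(
  W = matrix(c(0.90, 0.10,
               0.20, 0.80,
               0.55, 0.45), ncol = 2, byrow = TRUE),
  loglikelihood = -512.3,
  number_parameters = 11
)

# Hard classification: the component with the highest posterior probability
cluster <- apply(res$W, 1, which.max)
# The largest posterior probability indicates how clear-cut each assignment is
certainty <- apply(res$W, 1, max)

# Disparity and AIC as defined above
disparity <- -2 * res$loglikelihood
aic <- disparity + 2 * res$number_parameters
```

Fitting the model for several values of K and comparing the resulting AIC values is one common way to choose the number of mixture components.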

References

Zhang, Y., Einbeck, J. and Drikvandi, R. (2023). A multilevel multivariate response model for data with latent structures. In: Proceedings of the 37th International Workshop on Statistical Modelling, pages 343-348. Link on RG: https://www.researchgate.net/publication/375641972_A_multilevel_multivariate_response_model_for_data_with_latent_structures

See Also

mult.reg_2level.

Examples


## Example for data without covariates.
library(mult.latent.reg)
data(trading_data)
set.seed(49)
trade_res <- mult.em_2level(trading_data, K = 4, steps = 10, var_fun = 2)

## Assign each upper-level unit to the component with the highest posterior probability.
i_1 <- apply(trade_res$W, 1, which.max)
## Repeat each assignment for the lower-level units; the counts are the group sizes.
ind_certain <- rep(as.vector(i_1), c(4,5,5,3,5,5,4,4,5,5,5,5,5,5,5,5,5,5,
3,5,5,5,5,4,4,5,5,5,4,5,4,5,5,5,3,5,5,5,5,5,5,4,5,4))
colors <- c("#FF6600", "#66BD63", "lightpink", "purple")
## Index the colours by component number so that the plot and the legend agree.
plot(trading_data[, -3], pch = 1, col = colors[ind_certain])
legend("topleft", legend = paste("Mass point", 1:4),
col = colors, pch = 1, cex = 0.8)

## Example for data with covariates: the Twins data.
library(lme4)
set.seed(26)
twins_res <- mult.em_2level(twins_data[, c(1, 2, 3)], v = twins_data[, c(4, 5, 6)],
K = 2, steps = 20, var_fun = 2)
## Estimated regression coefficients (Gamma) from the multivariate model.
coeffs <- twins_res$gamma
## Compare with an individual two-level model fitted with lmer();
## [2, 1] extracts the estimated coefficient of Depression.
summary(lmer(SelfTouchCodable ~ Depression + PSS + Anxiety + (1 | id),
data = twins_data, REML = TRUE))$coefficients[2, 1]


[Package mult.latent.reg version 0.1.7 Index]