mult.em_1level {mult.latent.reg} | R Documentation |
EM algorithm for multivariate one level model with covariates
Description
This function is used to obtain the Maximum Likelihood Estimates (MLE) using the EM algorithm for one-level multivariate data. The estimates enable users to conduct clustering, ranking, and simultaneous dimension reduction on the multivariate dataset. Furthermore, when covariates are included, the function supports the fitting of multivariate response models, expanding its utility for regression analysis. The details of the model used in this function can be found in Zhang and Einbeck (2024).
Arguments
data |
A data set object; we denote the dimension to be |
v |
Covariate(s). |
K |
Number of mixture components, the default is |
steps |
Number of iterations, the default is |
start |
Containing parameters involved in the proposed model ( |
option |
Four options for selecting the starting values for the parameters in the model. The default is option = 1. More details can be found in start_em. |
var_fun |
There are four types of variance specifications;
|
Value
The estimated parameters of the model x_{i} = \alpha + \beta z_k + \Gamma v_i + \varepsilon_i,
obtained through the EM algorithm at convergence.
p |
The estimates for the parameter |
alpha |
The estimates for the parameter |
z |
The estimates for the parameter |
beta |
The estimates for the parameter |
gamma |
The estimates for the parameter |
sigma |
The estimates for the parameter |
W |
The posterior probability matrix. |
loglikelihood |
The approximated log-likelihood of the fitted model. |
disparity |
The disparity ( |
number_parameters |
The number of parameters estimated in the EM algorithm. |
AIC |
The AIC value ( |
BIC |
The BIC value ( |
starting_values |
A list of starting values for parameters used in the EM algorithm. |
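The model stated at the start of this section, and the role of the posterior probability matrix W, can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: all parameter values are hypothetical, there is a single binary covariate, and the error variance is assumed diagonal (one standard deviation per response). The sketch simulates data from the model and then computes posterior component probabilities with the parameters treated as known, which corresponds to one E-step.

```r
set.seed(1)

# Hypothetical parameter values, chosen for illustration only
n     <- 300               # number of observations
p     <- c(0.4, 0.6)       # mixture weights, K = 2 components
z     <- c(-1.5, 1.0)      # mass points z_k
alpha <- c(0, 5)           # intercepts for m = 2 responses
beta  <- c(1, 2)           # loadings on the latent variable
gamma <- c(0.5, -0.5)      # effect of one covariate v on each response
sigma <- c(0.3, 0.4)       # per-response standard deviations (diagonal assumption)
m     <- length(alpha)

# Simulate x_i = alpha + beta * z_k + gamma * v_i + eps_i
v  <- rbinom(n, 1, 0.5)
k  <- sample(seq_along(p), n, replace = TRUE, prob = p)
mu <- matrix(alpha, n, m, byrow = TRUE) + outer(z[k], beta) + outer(v, gamma)
x  <- mu + sweep(matrix(rnorm(n * m), n, m), 2, sigma, `*`)

# Log of (mixture weight * Gaussian density) for every observation and component
sd_mat <- matrix(sigma, n, m, byrow = TRUE)
log_w <- sapply(seq_along(p), function(kk) {
  mu_k <- matrix(alpha + beta * z[kk], n, m, byrow = TRUE) + outer(v, gamma)
  dens <- matrix(dnorm(x, mean = mu_k, sd = sd_mat, log = TRUE), nrow = n)
  log(p[kk]) + rowSums(dens)
})

# Normalise each row on the log scale to get posterior probabilities (n x K)
W <- exp(log_w - apply(log_w, 1, max))
W <- W / rowSums(W)

# Maximum a posteriori (MAP) component for each observation
map <- max.col(W, ties.method = "first")
table(true = k, assigned = map)
```

Each row of W sums to one, and taking the row-wise maximum gives the MAP cluster assignment, which is how the W component is used in the Examples.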
References
Zhang, Y. and Einbeck, J. (2024). A Versatile Model for Clustered and Highly Correlated Multivariate Data. J Stat Theory Pract 18(5). doi:10.1007/s42519-023-00357-0
See Also
Examples
## Example for data without covariates.
data(faithful)
res <- mult.em_1level(faithful, K = 2, steps = 10, var_fun = 1)
## Graph showing the estimated one-dimensional space with cluster centers in red and alpha in green.
x <- res$alpha[1]+res$beta[1]*res$z
y <- res$alpha[2]+res$beta[2]*res$z
plot(faithful,col = 8)
points(x=x[1],y=y[1],type = "p",col = "red",pch = 17)
points(x=x[2],y=y[2],type = "p",col = "red",pch = 17)
points(x=res$alpha[1],y=res$alpha[2],type = "p",col = "darkgreen",pch = 4)
slope <- (y[2]-y[1])/(x[2]-x[1])
intercept <- y[1]-slope*x[1]
abline(intercept, slope, col="red")
## Graph showing the original data points assigned to clusters
## according to the maximum a posteriori (MAP) rule.
index <- apply(res$W, 1, which.max)
faithful_grouped <- cbind(faithful,index)
colors <- c("#FDAE61", "#66BD63")
plot(faithful_grouped[,-3], pch = 1, col = colors[factor(index)])
## Example for data with covariates.
data(fetal_covid_data)
set.seed(2)
covid_res <- mult.em_1level(fetal_covid_data[, 1:5], v = fetal_covid_data$status_bi,
                            K = 3, steps = 20, var_fun = 2)
## Estimated covariate effects (gamma) from the fitted model.
coeffs <- covid_res$gamma
coeffs
## Compare with the regression coefficients from fitting individual linear models.
summary(lm(UpperFaceMovements ~ status_bi, data = fetal_covid_data))$coefficients[2, 1]
summary(lm(Headmovements ~ status_bi, data = fetal_covid_data))$coefficients[2, 1]
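The per-response comparison above can also be done for all responses in one call, since lm() accepts a matrix response and returns one coefficient column per response. The following is a minimal sketch on simulated data (the variable names y1, y2 and v and all numeric values are hypothetical); with the package data one would presumably pass the response columns of fetal_covid_data instead.

```r
set.seed(1)

# Hypothetical data: two responses and one binary covariate
n <- 200
v <- rbinom(n, 1, 0.5)
Y <- cbind(y1 =  1 + 0.5 * v + rnorm(n),
           y2 = -1 + 2.0 * v + rnorm(n))

# One call fits a separate least-squares regression to each column of Y
fit    <- lm(Y ~ v)
slopes <- coef(fit)["v", ]   # covariate effect for every response

# The same slope obtained from an individual model for the first response
s1 <- coef(lm(Y[, "y1"] ~ v))[["v"]]

slopes
```

The resulting vector of slopes has one entry per response and can be set beside the gamma estimates of the joint model.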