DEEM {TensorClustering} | R Documentation |
Doubly-enhanced EM algorithm
Description
Doubly-enhanced EM algorithm for tensor clustering
Usage
DEEM(X, nclass, niter = 100, lambda = NULL, dfmax = n, pmax = nvars, pf = rep(1, nvars),
eps = 1e-04, maxit = 1e+05, sml = 1e-06, verbose = FALSE, ceps = 0.1,
initial = TRUE, vec_x = NULL)
Arguments
X |
Input tensor (or matrix) list of length n, where n is the number of observations. |
nclass |
Number of clusters. |
niter |
Maximum number of iterations for the EM algorithm. Default value is 100. |
lambda |
A user-specified lambda value, the tuning parameter of the sparsity penalty. Default value is NULL. |
dfmax |
The maximum number of selected variables in the model. Default is the number of observations n. |
pmax |
The maximum number of variables that can potentially be selected during iteration. In intermediate steps, the algorithm can select at most pmax variables. Default is nvars. |
pf |
Weight of lasso penalty. Default is a vector of value 1 and length nvars. |
eps |
Convergence threshold for the coordinate descent difference between iterations. Default value is 1e-04. |
maxit |
Maximum number of coordinate descent iterations over all lambda. Default value is 1e+05. |
sml |
Threshold for the ratio of the change in the loss function after each iteration to the old loss function value. Default value is 1e-06. |
verbose |
Indicates whether to print lambda during iteration. Default value is FALSE. |
ceps |
Convergence threshold for the cluster mean difference between iterations. Default value is 0.1. |
initial |
Whether to initialize the algorithm with K-means clustering. Default value is TRUE. |
vec_x |
Vectorized tensor data. Default value is NULL. |
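The arguments X and vec_x describe the same data in two layouts: a list of tensors and a matrix of vectorized tensors. The base R sketch below shows one way to move between them when the observations start out stacked along the last mode of a single array. The helper name to_tensor_list is hypothetical and not part of TensorClustering, and the one-observation-per-column orientation of vec_x is an assumption taken from the layout used in the Examples, not a documented requirement.

```r
# Hypothetical helper: split an array whose last mode indexes observations
# into a list of n tensors, one per observation.
to_tensor_list <- function(A) {
  d <- dim(A)
  M <- length(d) - 1            # order of each observation tensor
  n <- d[M + 1]                 # number of observations
  mat <- matrix(A, ncol = n)    # column i holds the vectorized i-th observation
  lapply(seq_len(n), function(i) array(mat[, i], dim = d[seq_len(M)]))
}

# Four 2 x 3 matrix observations stacked in a 2 x 3 x 4 array
A <- array(seq_len(2 * 3 * 4), dim = c(2, 3, 4))
X <- to_tensor_list(A)

# Assumed layout for vectorized data: one observation per column
vec_x <- sapply(X, as.vector)
```

The list X has the form expected by the X argument, a list of length n whose elements are tensors or matrices of identical dimension.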
Details
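The DEEM function implements the Doubly-Enhanced EM algorithm (DEEM) for tensor clustering. The observations $\mathbf{X}_i$ are assumed to follow the tensor normal mixture model (TNMM) with common covariances across different clusters:

$$\mathbf{X}_i \sim \sum_{k=1}^{K} \pi_k \mathrm{TN}(\boldsymbol{\mu}_k; \boldsymbol{\Sigma}_1, \ldots, \boldsymbol{\Sigma}_M), \quad i = 1, \ldots, n,$$

where $0 < \pi_k < 1$ is the prior probability for $\mathbf{X}$ to be in the $k$-th cluster such that $\sum_{k=1}^{K} \pi_k = 1$, $\boldsymbol{\mu}_k$ is the cluster mean of the $k$-th cluster and $\boldsymbol{\Sigma}_1, \ldots, \boldsymbol{\Sigma}_M$ are the common covariances across different clusters. Under the TNMM framework, the optimal clustering rule can be shown to be

$$\widehat{Y}^{opt} = \arg\max_k \{ \log \pi_k + \langle \mathbf{X} - (\boldsymbol{\mu}_1 + \boldsymbol{\mu}_k)/2, \mathbf{B}_k \rangle \},$$

where $\mathbf{B}_k = [\![ \boldsymbol{\mu}_k - \boldsymbol{\mu}_1; \boldsymbol{\Sigma}_1^{-1}, \ldots, \boldsymbol{\Sigma}_M^{-1} ]\!]$. In the enhanced E-step, DEEM imposes sparsity directly on the optimal clustering rule as a flexible alternative to popular low-rank assumptions on tensor coefficients $\mathbf{B}_k$ as

$$\min_{\mathbf{B}_2, \ldots, \mathbf{B}_K} \bigg[ \sum_{k=2}^{K} \Big( \langle \mathbf{B}_k, [\![ \mathbf{B}_k; \widehat{\boldsymbol{\Sigma}}_1^{(t)}, \ldots, \widehat{\boldsymbol{\Sigma}}_M^{(t)} ]\!] \rangle - 2 \langle \mathbf{B}_k, \widehat{\boldsymbol{\mu}}_k^{(t)} - \widehat{\boldsymbol{\mu}}_1^{(t)} \rangle \Big) + \lambda^{(t+1)} \sum_{\mathcal{J}} \sqrt{\sum_{k=2}^{K} b_{k,\mathcal{J}}^2} \bigg],$$

where $\lambda^{(t+1)}$ is a tuning parameter. In the enhanced M-step, DEEM employs a new estimator for the tensor correlation structure, which facilitates both the computation and the theoretical studies.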
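As a rough illustration of the clustering rule described above, the following base R sketch scores each cluster k by log(pi_k) + <X - (mu_1 + mu_k)/2, B_k>, where B_k is obtained by multiplying mu_k - mu_1 by the inverse covariance along every mode. The helpers mode_product and tnmm_rule are hypothetical names, not functions from TensorClustering, and the sketch treats all parameters as known. It does not reproduce the penalized E-step or the M-step that DEEM uses to estimate them.

```r
# Hypothetical helper: mode-m product of a tensor A with a matrix U.
mode_product <- function(A, U, m) {
  d <- dim(A)
  perm <- c(m, seq_along(d)[-m])
  Am <- matrix(aperm(A, perm), nrow = d[m])   # unfold A along mode m
  newd <- d
  newd[m] <- nrow(U)
  # multiply, fold back, and restore the original mode order
  aperm(array(U %*% Am, dim = newd[perm]), order(perm))
}

# Hypothetical helper: clustering rule with known parameters.
# pi_k: prior probabilities; mu: list of cluster means;
# sigma: list of mode-wise covariance matrices shared by all clusters.
tnmm_rule <- function(X, pi_k, mu, sigma) {
  K <- length(mu)
  scores <- numeric(K)
  for (k in seq_len(K)) {
    B <- mu[[k]] - mu[[1]]
    for (m in seq_along(sigma)) {
      B <- mode_product(B, solve(sigma[[m]]), m)
    }
    # tensor inner product written as an elementwise sum
    scores[k] <- log(pi_k[k]) + sum((X - (mu[[1]] + mu[[k]]) / 2) * B)
  }
  which.max(scores)
}

# Same setup as in the Examples: two clusters, identity covariances
dimen <- c(5, 5, 5)
mu <- list(array(0, dim = dimen), array(0, dim = dimen))
mu[[2]][1:3, 1, 1] <- 2
sigma <- list(diag(5), diag(5), diag(5))
label <- tnmm_rule(mu[[2]], c(0.5, 0.5), mu, sigma)
```

Because B_1 is always zero, the score of the first cluster reduces to log(pi_1), so only the K - 1 coefficient tensors B_2, ..., B_K affect the assignment.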
Value
pi |
A vector of estimated prior probabilities for clusters. |
mu |
A list of estimated cluster means. |
sigma |
A list of estimated covariance matrices. |
gamma |
A |
y |
A vector of estimated labels. |
iter |
Number of iterations until convergence. |
df |
Average number of zero elements in beta over iterations. |
beta |
A matrix of the vectorized tensor coefficients estimated in the enhanced E-step. |
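Cluster labels such as those returned in y are identified only up to a relabeling of the clusters, so comparing them with known labels requires minimizing over label permutations. The base R sketch below shows one way to do this for a small number of clusters. The helpers all_perms and cluster_error are hypothetical names and are not provided by TensorClustering.

```r
# Hypothetical helper: all permutations of a vector, returned as a list.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Hypothetical helper: misclassification rate minimized over relabelings.
# truth and est are integer label vectors with values in 1..K.
cluster_error <- function(truth, est, K) {
  errs <- vapply(all_perms(seq_len(K)),
                 function(p) mean(p[est] != truth),
                 numeric(1))
  min(errs)
}

truth <- c(1, 1, 2, 2)
err_swapped <- cluster_error(truth, c(2, 2, 1, 1), K = 2)  # labels swapped only
err_one_off <- cluster_error(truth, c(2, 2, 1, 2), K = 2)  # one point misplaced
```

With a fitted object, the same helper could be applied as cluster_error(y, myfit$y, K = 2), where y holds the true labels. The number of permutations grows as K!, so this brute-force search is only practical for small K.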
Author(s)
Kai Deng, Yuqing Pan, Xin Zhang and Qing Mai
References
Mai, Q., Zhang, X., Pan, Y. and Deng, K. (2021). A Doubly-Enhanced EM Algorithm for Model-Based Tensor Clustering. Journal of the American Statistical Association.
See Also
Examples
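library(TensorClustering)

# Dimensions of each 3-way tensor observation and problem size
dimen = c(5,5,5)
nvars = prod(dimen)
K = 2
n = 100

# Identity covariance for each mode (the noise below is standard normal)
sigma = array(list(),3)
sigma[[1]] = sigma[[2]] = sigma[[3]] = diag(5)

# Mean of the second cluster: three nonzero entries
B2=array(0,dim=dimen)
B2[1:3,1,1]=2

# True labels and cluster means
y = c(rep(1,50),rep(2,50))
M = array(list(),K)
M[[1]] = array(0,dim=dimen)
M[[2]] = B2

# Standard normal noise, one vectorized observation per column
vec_x=matrix(rnorm(n*prod(dimen)),ncol=n)

# Each observation is its cluster mean plus noise
X=array(list(),n)
for (i in 1:n){
  X[[i]] = array(vec_x[,i],dim=dimen)
  X[[i]] = M[[y[i]]] + X[[i]]
}

# Fit DEEM with two clusters and a fixed tuning parameter
myfit = DEEM(X, nclass=2, lambda=0.05)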