R: Fit the Tensor Envelope Mixture Model (TEMM)

TEMM {TensorClustering}

R Documentation

Fit the Tensor Envelope Mixture Model (TEMM)

Description

Fit the Tensor Envelope Mixture Model (TEMM)

Usage

TEMM(Xn, u, K, initial = "kmeans", iter.max = 500, 
stop = 1e-3, trueY = NULL, print = FALSE)

Arguments

`Xn`	The tensor data to cluster. Should be an array whose last dimension is the sample size `n`.
`u`	A vector of envelope dimensions, one for each tensor mode.
`K`	Number of clusters, greater than or equal to `2`.
`initial`	Initialization method for the regularized EM algorithm. Default value is "kmeans".
`iter.max`	Maximum number of iterations. Default value is `500`.
`stop`	Convergence threshold for the relative change in cluster means. Default value is `1e-3`.
`trueY`	A vector of true cluster labels for each observation. Default value is NULL.
`print`	Whether to print, in each iteration, the current iteration number, the relative change in cluster means and the clustering error (`%`). Default value is `FALSE`.
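
Because `Xn` must carry the sample index in its last dimension, data stored as a list of per-observation matrices has to be stacked first. The following base R sketch shows one way to do that; the object names (`mats`, `Xn`) and the simulated values are purely illustrative and not part of the package.

```r
# Hypothetical data: four 2 x 3 matrix-valued observations.
set.seed(1)
mats <- replicate(4, matrix(rnorm(6), nrow = 2, ncol = 3), simplify = FALSE)

# Stack along a new last dimension so that Xn[, , i] is observation i.
# unlist() and array() both use column-major order, so each matrix is
# copied into its own slice unchanged.
Xn <- array(unlist(mats), dim = c(dim(mats[[1]]), length(mats)))

dim(Xn)   # 2 3 4: p1, p2, then the sample size n
```

An array built this way can be passed as `Xn`, with `u` holding one envelope dimension for each of the first two modes.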

Details

The TEMM function fits the Tensor Envelope Mixture Model (TEMM) through a subspace-regularized EM algorithm. For mode m, let (\bm{\Gamma}_m,\bm{\Gamma}_{0m})\in R^{p_m\times p_m} be an orthogonal matrix where \bm{\Gamma}_{m}\in R^{p_{m}\times u_{m}}, u_{m}\leq p_{m}, represents the material part. Specifically, the material part \mathbf{X}_{\star,m}=\mathbf{X}\times_{m}\bm{\Gamma}_{m}^{T} follows a tensor normal mixture distribution, while the immaterial part \mathbf{X}_{\circ,m}=\mathbf{X}\times_{m}\bm{\Gamma}_{0m}^{T} is unimodal, independent of the material part and hence can be eliminated without loss of clustering information. Dimension reduction is achieved by focusing on the material part \mathbf{X}_{\star,m}=\mathbf{X}\times_{m}\bm{\Gamma}_{m}^{T}. Collectively, the joint reduction from each mode is

\mathbf{X}_{\star}=[\![\mathbf{X};\bm{\Gamma}_{1}^{T},\dots,\bm{\Gamma}_{M}^{T}]\!]\sim\sum_{k=1}^{K}\pi_{k}\mathrm{TN}(\bm{\alpha}_{k};\bm{\Omega}_{1},\dots,\bm{\Omega}_{M}),\quad \mathbf{X}_{\star}\perp\!\!\!\perp\mathbf{X}_{\circ,m},

where \bm{\alpha}_{k}\in R^{u_{1}\times\cdots\times u_{M}} and \bm{\Omega}_m\in R^{u_m\times u_m} are the dimension-reduced clustering parameters and \mathbf{X}_{\circ,m} does not vary with cluster index Y. In the E-step, the membership weights are evaluated as

\widehat{\eta}_{ik}^{(s)}=\frac{\widehat{\pi}_{k}^{(s-1)}f_{k}(\mathbf{X}_i;\widehat{\bm{\theta}}^{(s-1)})}{\sum_{k=1}^{K}\widehat{\pi}_{k}^{(s-1)}f_{k}(\mathbf{X}_i;\widehat{\bm{\theta}}^{(s-1)})},

where f_k denotes the conditional probability density function of \mathbf{X}_i within the k-th cluster. In the subspace-regularized M-step, the envelope subspace is iteratively estimated through a Grassmann manifold optimization that minimizes the following log-likelihood-based objective function:

G_m^{(s)}(\bm{\Gamma}_m) = \log|\bm{\Gamma}_m^T \mathbf{M}_m^{(s)} \bm{\Gamma}_m|+\log|\bm{\Gamma}_m^T (\mathbf{N}_m^{(s)})^{-1} \bm{\Gamma}_m|,

where \mathbf{M}_{m}^{(s)} and \mathbf{N}_{m}^{(s)} are given by

\mathbf{M}_m^{(s)} = \frac{1}{np_{-m}}\sum_{i=1}^{n} \sum_{k=1}^{K}\widehat{\eta}_{ik}^{(s)} (\bm{\epsilon}_{ik}^{(s)})_{(m)}(\widehat{\bm{\Sigma}}_{-m}^{(s-1)})^{-1} (\bm{\epsilon}_{ik}^{(s)})_{(m)}^T,

\mathbf{N}_m^{(s)} = \frac{1}{np_{-m}}\sum_{i=1}^{n} (\mathbf{X}_i)_{(m)}(\widehat{\bm{\Sigma}}_{-m}^{(s-1)})^{-1}(\mathbf{X}_i)_{(m)}^T.

The intermediate estimator \mathbf{M}_{m}^{(s)} can be viewed as the mode-m conditional variation estimate of \mathbf{X}\mid Y, and \mathbf{N}_{m}^{(s)} is the mode-m marginal variation estimate of \mathbf{X}.
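
The two formulas that drive each iteration, the E-step membership weights and the mode-m objective G_m, can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: the helper names `e_step_weights` and `envelope_objective` are hypothetical, the log-densities are assumed to be supplied as an `n` by `K` matrix, and the weights are computed on the log scale for numerical stability, which is a choice made here rather than something stated above.

```r
# Hypothetical helper: membership weights eta[i, k] from log-densities.
# logf: n x K matrix with entries log f_k(X_i); pi_k: length-K proportions.
e_step_weights <- function(logf, pi_k) {
  logw <- sweep(logf, 2, log(pi_k), "+")   # log(pi_k) + log f_k(X_i)
  logw <- logw - apply(logw, 1, max)       # subtract row maxima for stability
  w <- exp(logw)
  w / rowSums(w)                           # each row sums to 1
}

# Hypothetical helper: G_m(Gamma) = log|Gamma' M Gamma| + log|Gamma' N^{-1} Gamma|.
# Gamma: p_m x u_m basis matrix; M, N: p_m x p_m positive definite matrices.
envelope_objective <- function(Gamma, M, N) {
  a <- determinant(crossprod(Gamma, M %*% Gamma), logarithm = TRUE)$modulus
  b <- determinant(crossprod(Gamma, solve(N, Gamma)), logarithm = TRUE)$modulus
  as.numeric(a + b)
}

# Toy inputs, chosen only to exercise the two helpers.
logf <- matrix(c(-1, -3,
                 -4, -2), nrow = 2, byrow = TRUE)
eta <- e_step_weights(logf, pi_k = c(0.5, 0.5))

M <- diag(c(1, 2, 3))
N <- diag(c(2, 2, 2))
Gamma <- diag(3)[, 1, drop = FALSE]        # one-dimensional candidate subspace
G <- envelope_objective(Gamma, M, N)
```

In the algorithm described above, a function like `envelope_objective` would be minimized over semi-orthogonal `Gamma`; the sketch only evaluates it at a fixed candidate.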

Value

`id`	A vector of estimated labels.
`pi`	A vector of estimated prior probabilities for clusters.
`eta`	An `n` by `K` matrix of estimated membership weights.
`Mu.est`	A list of estimated cluster means.
`SIG.est`	A list of estimated covariance matrices.
`Mm`	Estimate of `Mm` as defined in the paper (see References).
`Nm`	Estimate of `Nm` as defined in the paper (see References).
`Gamma.est`	A list of estimated envelope bases.
`PGamma.est`	A list of envelope projection matrices.

Author(s)

Kai Deng, Yuqing Pan, Xin Zhang and Qing Mai

References

Deng, K. and Zhang, X. (2021). Tensor Envelope Mixture Model for Simultaneous Clustering and Multiway Dimension Reduction. Biometrics.

Examples

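  library(TensorClustering)
  # 2 x 2 tensors, n = 10: samples 1-5 have mean 1, samples 6-10 have mean 2
  A = array(c(rep(1,20),rep(2,20))+rnorm(40),dim=c(2,2,10))
  myfit = TEMM(A,u=c(2,2),K=2)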
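
As a further sketch, the components listed under Value can be inspected after fitting. The seed and the `truth` vector below are assumptions added for illustration: `truth` simply encodes how the toy data were simulated (the first five samples centered at 1, the last five at 2), and the estimated labels may be a permutation of it.

```r
library(TensorClustering)

set.seed(1)  # arbitrary seed, used only to make the simulated noise reproducible
A <- array(c(rep(1, 20), rep(2, 20)) + rnorm(40), dim = c(2, 2, 10))
truth <- rep(1:2, each = 5)   # labels implied by how A was simulated

fit <- TEMM(A, u = c(2, 2), K = 2)

table(estimated = fit$id, truth = truth)  # cross-tabulate; label order is arbitrary
round(fit$eta, 3)                         # n by K membership weights
fit$pi                                    # estimated prior probabilities
```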

[Package TensorClustering version 1.0.2 Index]
