R: Select the number of clusters 'K' in DEEM

tune_K {TensorClustering}

R Documentation

Select the number of clusters `K` in DEEM

Description

Selects the number of clusters K together with the tuning parameter lambda by minimizing BIC in DEEM.

Usage

tune_K(X, seqK, seqlamb, initial = TRUE, vec_x = NULL)

Arguments

`X`	Input tensor (or matrix) list of length `n`, where `n` is the number of observations. Each element of the list is a tensor or matrix. The tensor order can be any integer greater than or equal to 2.
`seqK`	A sequence of user-specified candidate numbers of clusters.
`seqlamb`	A sequence of user-specified `lambda` values. `lambda` is the weight of the L1 penalty; a smaller `lambda` allows more variables to be nonzero.
`initial`	Whether to initialize the algorithm with K-means clustering. Default value is `TRUE`.
`vec_x`	Vectorized tensor data. Default value is `NULL`.
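
The relationship between `X` and `vec_x` can be illustrated with a minimal sketch. It assumes, as the simulation in the Examples section suggests, that each column of `vec_x` holds one vectorized observation; this layout is an assumption, not something the argument description states, and the small dimensions below are arbitrary.

```r
# Assumption: column i of vec_x is the vectorized form of X[[i]]
# (this mirrors how the Examples section reshapes vec_x[, i] into a tensor).
dimen <- c(2, 3, 4)   # arbitrary small tensor dimensions for illustration
n <- 5                # number of observations

# A list of n tensors, the format expected for the X argument
X <- lapply(seq_len(n), function(i) array(rnorm(prod(dimen)), dim = dimen))

# Vectorize each tensor; the result is a prod(dimen) x n matrix
vec_x <- sapply(X, as.vector)

# Round trip: reshaping column i with dim = dimen recovers X[[i]]
X_back <- lapply(seq_len(n), function(i) array(vec_x[, i], dim = dimen))
```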

Details

The tune_K function runs the tune_lamb function length(seqK) times to choose the tuning parameter \lambda and the number of clusters K simultaneously. Let \widehat{\bm{\theta}}^{\{\lambda,K\}} be the output of DEEM with the tuning parameter and number of clusters fixed at \lambda and K, respectively. tune_K looks for the values of \lambda and K that minimize

\mathrm{BIC}(\lambda,K)=-2\sum_{i=1}^n\log(\sum_{k=1}^K\widehat{\pi}^{\{\lambda,K\}}_kf_k(\mathbf{X}_i;\widehat{\bm{\theta}}_k^{\{\lambda,K\}}))+\log(n)\cdot |\widehat{\mathcal{D}}^{\{\lambda,K\}}|,

where \widehat{\mathcal{D}}^{\{\lambda,K\}}=\{(k, {\mathcal{J}}): \widehat b_{k,{\mathcal{J}}}^{\lambda} \neq 0 \} is the set of nonzero elements in \widehat{\bm{B}}_2^{\{\lambda,K\}},\ldots,\widehat{\bm{B}}_K^{\{\lambda,K\}}. The tune_K function intrinsically selects the initial point and returns the optimal estimated labels.
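
The criterion above can be sketched in a few lines of R. This is a hedged illustration rather than the package's implementation: `bic_sketch`, `logdens`, `pi_hat`, and `B_list` are hypothetical names, and the inputs are assumed to be available from a fitted model. The toy grid at the end uses arbitrary numbers solely to show how the minimizing pair would be read off.

```r
# Hypothetical helper, not part of TensorClustering.
# logdens: n x K matrix, entry [i, k] = log f_k(X_i; theta_k)
# pi_hat:  length-K vector of estimated mixing proportions
# B_list:  list holding the K - 1 estimated tensors B_2, ..., B_K
bic_sketch <- function(logdens, pi_hat, B_list) {
  n <- nrow(logdens)
  # log(sum_k pi_k * f_k(X_i)) for each i, via the log-sum-exp trick
  a <- sweep(logdens, 2, log(pi_hat), "+")
  m <- apply(a, 1, max)
  loglik <- sum(m + log(rowSums(exp(a - m))))
  # |D|: number of nonzero entries across B_2, ..., B_K
  n_nonzero <- sum(unlist(lapply(B_list, function(B) sum(B != 0))))
  -2 * loglik + log(n) * n_nonzero
}

# Toy inputs: two observations, two clusters, one 2 x 2 difference matrix
logdens <- matrix(log(c(0.2, 0.4, 0.1, 0.3)), nrow = 2)
pi_hat <- c(0.5, 0.5)
B_list <- list(array(c(2, 0, 0, 2), dim = c(2, 2)))
bic_value <- bic_sketch(logdens, pi_hat, B_list)

# Reading the minimizer off a grid of BIC values
# (rows index seqK, columns index seqlamb; the numbers are arbitrary).
seqK <- 2:4
seqlamb <- c(0.01, 0.05, 0.1)
bic_grid <- matrix(c(310, 305, 320,
                     300, 290, 315,
                     330, 325, 340), nrow = 3, byrow = TRUE)
best <- which(bic_grid == min(bic_grid), arr.ind = TRUE)[1, ]
opt_K <- seqK[best["row"]]
opt_lamb <- seqlamb[best["col"]]
```

The log-sum-exp step only guards against numerical underflow when the component densities are very small, which is common for high-dimensional tensor data; it does not change the value of the criterion.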

Value

`opt_K`	Selected number of clusters that leads to optimal BIC.
`opt_lamb`	Tuned `lambda` that leads to optimal BIC.
`Krank`	A selection summary.

Author(s)

Kai Deng, Yuqing Pan, Xin Zhang and Qing Mai

References

Mai, Q., Zhang, X., Pan, Y. and Deng, K. (2021). A Doubly-Enhanced EM Algorithm for Model-Based Tensor Clustering. Journal of the American Statistical Association.

Examples


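library(TensorClustering)

# Tensor dimensions, number of clusters, and sample size
dimen = c(5,5,5)
nvars = prod(dimen)
K = 2
n = 100

# Identity covariance along each of the three modes
sigma = array(list(),3)
sigma[[1]] = sigma[[2]] = sigma[[3]] = diag(5)

# Mean difference between the two clusters: sparse, three nonzero entries
B2=array(0,dim=dimen)
B2[1:3,1,1]=2

# True labels and cluster means
y = c(rep(1,50),rep(2,50))
M = array(list(),K)
M[[1]] = array(0,dim=dimen)
M[[2]] = B2

# Simulate standard normal noise and add the cluster mean to each observation
vec_x=matrix(rnorm(n*prod(dimen)),ncol=n)
X=array(list(),n)
for (i in 1:n){
  X[[i]] = array(vec_x[,i],dim=dimen)
  X[[i]] = M[[y[i]]] + X[[i]]
}

# Select K and lambda jointly over the supplied grids
mytune = tune_K(X, seqK=2:4, seqlamb=seq(0.01,0.1,by=0.01))
mytune$opt_K     # selected number of clusters
mytune$opt_lamb  # selected lambda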

[Package TensorClustering version 1.0.2 Index]
