hcsvd {bdsvd} | R Documentation |
Hierarchical Variable Clustering Using Singular Vectors (HC-SVD).
Description
Performs HC-SVD to reveal the hierarchical variable structure as described in Bauer (202Xb). In this divisive approach, each cluster is iteratively split into two clusters. Potential splits are identified by the first sparse loadings (sparse approximations of the first right singular vectors, i.e., vectors with many zero entries) that mirror the masked shape of the correlation matrix. The procedure continues until each variable lies in its own cluster.
Usage
hcsvd(X, k = "all", linkage = "single", reliability, R, max.iter, trace = TRUE)
Arguments
X |
Data matrix of dimension |
k |
Number of sparse loadings to be used. This should be |
linkage |
The linkage function to be used. This should be one of |
reliability |
By default, the value of each cluster equals the distance calculated by the chosen linkage function.
If preferred, the value of each cluster can be assigned by its reliability. When |
R |
Sample correlation matrix of |
max.iter |
How many iterations should be performed for computing the sparse loadings.
Default is |
trace |
Print out progress as |
Details
The sparse loadings are computed using the method by Shen & Huang (2008), implemented in the irlba package.
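To give a feel for what a sparse loading is, the following is a minimal base-R sketch of a rank-one sparse SVD by alternating soft thresholding, in the spirit of Shen & Huang (2008). It is not the irlba or bdsvd implementation: the helper names soft_threshold and sparse_loading, the penalty lambda = 6, and the toy data are assumptions made purely for illustration.

```r
# Illustrative only: simplified rank-one sparse SVD, NOT the irlba/bdsvd code.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

sparse_loading <- function(X, lambda, max.iter = 500, tol = 1e-8) {
  # Start from the dense first right singular vector
  v <- svd(X, nu = 0, nv = 1)$v[, 1]
  for (i in seq_len(max.iter)) {
    u <- X %*% v
    u <- u / sqrt(sum(u^2))                       # unit-length left vector
    v_new <- soft_threshold(drop(crossprod(X, u)), lambda)
    if (all(v_new == 0)) stop("lambda is too large: all loadings are zero")
    v_new <- v_new / sqrt(sum(v_new^2))           # unit-length sparse loading
    converged <- sum((v_new - v)^2) < tol
    v <- v_new
    if (converged) break
  }
  v
}

# Toy data (hypothetical): a strong block of 6 variables and a weaker block of 3
set.seed(1)
n  <- 200
f1 <- rnorm(n)
f2 <- resid(lm(rnorm(n) ~ f1))   # second factor, uncorrelated with f1 in-sample
X  <- cbind(sapply(1:6, function(j) f1 + 0.3 * rnorm(n)),
            sapply(1:3, function(j) 0.5 * f2 + 0.3 * rnorm(n)))
X  <- scale(X)

v <- sparse_loading(X, lambda = 6)
which(v != 0)   # indices of the variables picked up by the sparse loading
```

In this toy setting the nonzero entries of v mark the dominant block of variables, which is the kind of support pattern that can suggest a split of a cluster into two.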
Value
A list with three components:
dist.matrix |
The ultrametric distance matrix (cophenetic matrix) of the HC-SVD structure as an object of class |
u.cor |
The ultrametric correlation matrix of |
k.p |
A vector of length |
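As background on the terms "ultrametric" and "cophenetic", the sketch below uses only base R (stats) on simulated data, not hcsvd output. The dissimilarity 1 - |cor| and the helper is_ultrametric are assumptions chosen for illustration. It checks the defining property of an ultrametric, d(i, j) <= max(d(i, k), d(k, j)) for all triples.

```r
# Illustrative only: cophenetic distances of a dendrogram are ultrametric.
set.seed(42)
X <- matrix(rnorm(50 * 6), nrow = 50, ncol = 6)
colnames(X) <- paste0("V", 1:6)

d     <- as.dist(1 - abs(cor(X)))        # simple correlation-based dissimilarity (assumption)
tree  <- hclust(d, method = "single")
ultra <- as.matrix(cophenetic(tree))     # cophenetic (ultrametric) distance matrix

# Check the strong triangle inequality for every triple of variables
is_ultrametric <- function(D, tol = 1e-12) {
  p <- nrow(D)
  for (i in 1:p) for (j in 1:p) for (k in 1:p) {
    if (D[i, j] > max(D[i, k], D[k, j]) + tol) return(FALSE)
  }
  TRUE
}

is_ultrametric(ultra)
```

Because an ultrametric distance matrix encodes the merge heights of a tree, passing it to hclust recovers a dendrogram, which is why the distance matrix alone is enough for plotting.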
References
Bauer, J.O. (202Xb). Hierarchical variable clustering using singular vectors.
Shen, H. and Huang, J.Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation, J. Multivar. Anal. 99, 1015–1034.
Examples
#We replicate the simulation study in Bauer (202Xb)
## Not run:
p <- 100
n <- 300
b <- 5
design <- "a"
Rho <- hcsvd.cor.sim(p = p, b = b, design = design)
X <- scale(mvtnorm::rmvnorm(n, mean = rep(0, p), sigma = Rho, checkSymmetry = FALSE))
colnames(X) <- 1:ncol(X)
hcsvd.obj <- hcsvd(X, k = "Kaiser")
#The dendrogram can be obtained from the ultrametric distance matrix:
plot(hclust(hcsvd.obj$dist.matrix))
## End(Not run)