bdsvd.ht {bdsvd}    R Documentation

Hyperparameter Tuning for BD-SVD

Description

Finds the number of non-zero elements of the sparse loading according to the high-dimensional Bayesian information criterion (HBIC).

Usage

bdsvd.ht(X, dof.lim, standardize = TRUE, anp = "2", max.iter)

Arguments

X

Data matrix of dimension n x p with possibly p >> n.

dof.lim

Interval limits for the number of non-zero components in the sparse loading (degrees of freedom). If S denotes the support of the sparse loading v, then the cardinality of the support, |S|, corresponds to the degrees of freedom. The default, dof.lim <- c(0, p-1), is highly recommended because it checks all levels of sparsity.

standardize

Standardize the data to have unit variance. Default is TRUE.

anp

Regularization function to use for the HBIC. anp = "1" implements a_{np} = 1, which corresponds to the BIC; anp = "2" implements a_{np} = 1/2 log(np), which corresponds to the regularization used by Bauer (202Xa); and anp = "3" implements a_{np} = log(log(np)), which corresponds to the regularization used by Wang et al. (2009) and Wang et al. (2013). Default is "2".

max.iter

Maximum number of iterations for computing the sparse loading. Default is 200.
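
As a rough illustration of the three anp options, the sketch below evaluates a_{np} for given n and p in base R. The helper name anp_value is hypothetical and not part of bdsvd, and the expression 1/2 log(np) is read here as one half times log(np), which is an assumption about the notation.

```r
# Hypothetical helper (not part of the bdsvd package): evaluates the
# regularization term a_{np} for each documented anp option.
anp_value <- function(n, p, anp = c("2", "1", "3")) {
  anp <- match.arg(anp)
  switch(anp,
         "1" = 1,                  # a_{np} = 1, corresponds to the BIC
         "2" = 0.5 * log(n * p),   # assumed reading of 1/2 log(np)
         "3" = log(log(n * p)))    # a_{np} = log(log(np))
}

# Compare the three terms for the dimensions used in the Examples section
n <- 500
p <- 300
sapply(c("1", "2", "3"), function(a) anp_value(n, p, a))
```

For np this large, option "2" gives the largest term of the three and so, under this reading, the strongest regularization.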

Details

The sparse loadings are computed using the method by Shen & Huang (2008), implemented in the irlba package. The computation of the HBIC is outlined in Bauer (202Xa).

Value

dof

The optimal number of non-zero components (degrees of freedom) according to the HBIC.

BIC

The HBIC values for the different numbers of non-zero components.
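
The relation between the two return values can be sketched as follows. This is an illustration under two assumptions that are not confirmed here: that the optimum is the minimizer of the HBIC, and that the HBIC values are ordered by |S| starting at the lower limit of dof.lim. The helper dof_from_hbic and the toy values are hypothetical.

```r
# Hypothetical helper (not part of bdsvd): maps the position of the
# smallest HBIC value to the corresponding number of non-zero components,
# assuming the first entry belongs to |S| = dof.min.
dof_from_hbic <- function(hbic, dof.min = 0) {
  unname(which.min(hbic)) - 1 + dof.min
}

# Made-up HBIC values for |S| = 0, 1, 2 (illustration only)
hbic_toy <- c(5.0, 3.2, 4.1)
dof_from_hbic(hbic_toy)
```

With an actual result ht from bdsvd.ht, the analogous call under these assumptions would be dof_from_hbic(ht$BIC[, 1]).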

References

Bauer, J.O. (202Xa). High-dimensional block diagonal covariance structure detection using singular vectors.

Shen, H. and Huang, J.Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation, J. Multivar. Anal. 99, 1015–1034.

Wang, H., Li, B., and Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters, J. R. Stat. Soc. B 71 (3), 671–683.

Wang, L., Kim, Y., and Li, R. (2013). Calibrating nonconvex penalized regression in ultra-high dimension, Ann. Stat. 41 (5), 2505–2536.

See Also

bdsvd, single.bdsvd

Examples

#Replicate the illustrative example from Bauer (202Xa).

p <- 300 #Number of variables. In Bauer (202Xa), p = 3000
n <- 500 #Number of observations
b <- 3   #Number of blocks
design <- "c"

#Simulate data matrix X
set.seed(1)
Sigma <- bdsvd.cov.sim(p = p, b = b, design = design)
X <- mvtnorm::rmvnorm(n, mean=rep(0, p), sigma=Sigma)
colnames(X) <- seq_len(p)

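#Select the number of non-zero components using the HBIC
ht <- bdsvd.ht(X)

#Plot the HBIC against the degrees of freedom |S|
plot(0:(p-1), ht$BIC[,1], xlab = "|S|", ylab = "HBIC", main = "", type = "l")

#Apply single.bdsvd with the HBIC-selected degrees of freedom
single.bdsvd(X, dof = ht$dof, standardize = FALSE)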


[Package bdsvd version 0.2.0 Index]