R: Bayesian Sparse Covariance Estimation using Sure Screening

sbmspcov {bspcov}

R Documentation

Bayesian Sparse Covariance Estimation using Sure Screening

Description

Provides a Bayesian sparse and positive definite estimate of a covariance matrix via screened beta-mixture prior.

Usage

sbmspcov(X, Sigma, cutoff = list(), prior = list(), nsample = list())

Arguments

`X`	a n `\times` p data matrix with column mean zero.
`Sigma`	an initial guess for Sigma.
`cutoff`	a list giving the information for the threshold. The list includes the following parameters (with default values in parentheses): `method ('FNR')` giving the method for determining the threshold value (method='FNR' uses the false negative rate (FNR)-based approach, method='corr' chooses the threshold value by sample correlations), `rho` a lower bound of meaningfully large correlations whose the defaults values are 0.25 and 0.2 for method = 'FNR' and method = 'corr', respectively. Note. If method='corr', `rho` is used as the threshold value. `FNR (0.05)` giving the prespecified FNR level when method = 'FNR'. `nsimdata (1000)` giving the number of simulated datasets for calculating Jeffreys’ default Bayes factors when method = 'FNR'.
`prior`	a list giving the prior information. The list includes the following parameters (with default values in parentheses): `a (1/2)` and `b (1/2)` giving the shape parameters for beta distribution, `lambda (1)` giving the hyperparameter for the diagonal elements, `tau1sq (log(p)/(p^2*n))` giving the hyperparameter for the shrinkage prior of covariance.
`nsample`	a list giving the MCMC parameters. The list includes the following integers (with default values in parentheses): `burnin (1000)` giving the number of MCMC samples in transition period, `nmc (1000)` giving the number of MCMC samples for analysis.

Details

Lee, Jo, Lee, and Lee (2023+) proposed the screened beta-mixture shrinkage prior for estimating a sparse and positive definite covariance matrix. The screened beta-mixture shrinkage prior for \Sigma = (\sigma_{jk}) is defined as

\pi(\Sigma) = \frac{\pi^{u}(\Sigma)I(\Sigma \in C_p)}{\pi^{u}(\Sigma \in C_p)}, ~ C_p = \{\mbox{ all } p \times p \mbox{ positive definite matrices }\},

where \pi^{u}(\cdot) is the unconstrained prior given by

\pi^{u}(\sigma_{jk} \mid \psi_{jk}) = N\left(\sigma_{jk} \mid 0, \frac{\psi_{jk}}{1 - \psi_{jk}}\tau_1^2\right), ~ \psi_{jk} = 1 - 1/(1 + \phi_{jk})

\pi^{u}(\psi_{jk}) = Beta(\psi_{jk} \mid a, b) ~ \mbox{for } (j, k) \in S_r(\hat{R})

\pi^{u}(\sigma_{jj}) = Exp(\sigma_{jj} \mid \lambda),

where S_r(\hat{R}) = \{(j,k) : 1 < j < k \le p, |\hat{\rho}_{jk}| > r\}, \hat{\rho}_{jk} are sample correlations, and r is a threshold given by user.

For more details, see Lee, Jo, Lee and Lee (2022+).

Value

`Sigma`	a nmc `\times` p(p+1)/2 matrix including lower triangular elements of covariance matrix.
`p`	dimension of covariance matrix.
`Phi`	a nmc `\times` p(p+1)/2 matrix including shrinkage parameters corresponding to lower triangular elements of covariance matrix.
`INDzero`	a list including indices of off-diagonal elements screened by sure screening.
`cutoff`	the cutoff value specified by FNR-approach.

Author(s)

Kyoungjae Lee and Seongil Jo

References

Lee, K., Jo, S., Lee, K., and Lee, J. (2023+), "Scalable and optimal Bayesian inference for sparse covariance matrices via screened beta-mixture prior", arXiv:2206.12773.

Examples


set.seed(1)
n <- 20
p <- 5

# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
  delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
  True.Sigma <- True.Sigma + delta*diag(p)
}

# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)

# compute sparse, positive covariance estimator:
fout <- bspcov::sbmspcov(X = X, Sigma = diag(diag(cov(X))))
post.est.m <- bspcov::estimate(fout)
sqrt(mean((post.est.m - True.Sigma)^2))
sqrt(mean((cov(X) - True.Sigma)^2))

[Package bspcov version 1.0.0 Index]