jointMeanCovStability {jointMeanCov} | R Documentation |
Estimate Mean and Correlation Structure Using Stability Selection
Description
Given a data matrix, this function performs stability selection as described in the section "Stability of Gene Sets" in the paper Joint mean and covariance estimation with unreplicated matrix-variate data (2018), M. Hornstein, R. Fan, K. Shedden, and S. Zhou; Journal of the American Statistical Association.
Usage
jointMeanCovStability(X, group.one.indices, rowpen,
n.genes.to.group.center = NULL)
Arguments
X |
Data matrix of size n by m. |
group.one.indices |
Vector of indices denoting rows in group one. |
rowpen |
Glasso penalty for estimating B, the row-row correlation matrix. |
n.genes.to.group.center |
Vector specifying the number of genes to group center on each iteration of the stability selection algorithm. The length of this vector is equal to the number of iterations of stability selection. If this argument is not provided, the default is a decreasing sequence starting with m, followed |
Details
Let m[i]
denote the number of group-centered genes on
the ith iteration of stability selection (where m[i]
is a decreasing sequence).
Estimated group means are initialized using unweighted
sample means. Then, for each iteration of stability
selection:
1. The top m[i]
genes are selected for group centering
by ranking the estimated group differences from the previous
iteration.
2. Group means and global means are estimated using
GLS, using the inverse row covariance matrix from the
previous iteration. The centered data matrix is then
used as input to Gemini to estimate the inverse row covariance
matrix B.hat.inv.
3. Group means are estimated using GLS with B.hat.inv.
Value
n.genes.to.group.center |
Number of group centered genes on each iteration of stability selection. |
betaHat.init |
Matrix of size 2 by m, containing sample means for each group. Row 1 contains sample means for group one, and row 2 contains sample means for group two. |
gammaHat.init |
Vector of length m containing differences in sample means. |
B.inv.list |
List of estimated row-row inverse covariance
matrices, where |
corr.B.inv.list |
List of inverse correlation matrices
corresponding to the inverse covariance matrices
|
betaHat |
List of matrices of size 2 by m, where m is
the number of columns of X. For each matrix, entry (i, j) contains the
estimated mean of the jth column for an individual in
group i, with i = 1,2, and j = 1, ..., m. The matrix
|
gamma.hat |
List of vectors of estimated group mean
differences. The vector |
design.effecs |
Vector containing the estimated design effect for each iteration of stability selection. |
gls.test.stats |
List of vectors of test statistics for each iteration of stability selection. |
p.vals |
List of vectors of two-sided p-values, calculated using the standard normal distribution. |
p.vals.adjusted |
List of vectors of p-values, adjusted using the Benjamini-Hochberg fdr adjustment. |
Examples
# Generate matrix-variate data.
n1 <- 5
n2 <- 5
n <- n1 + n2
group.one.indices <- 1:5
group.two.indices <- 6:10
m <- 20
M <- matrix(0, nrow=n, ncol=m)
# In this example, the first three variables have nonzero
# mean differences.
M[1:n1, 1:3] <- 3
M[(n1 + 1):n2, 1:3] <- -3
X <- matrix(rnorm(n * m), nrow=n, ncol=m) + M
# Apply the stability algorithm.
rowpen <- sqrt(log(m) / n)
n.genes.to.group.center <- c(10, 5, 2)
out <- jointMeanCovStability(
X, group.one.indices, rowpen, c(2e3, n.genes.to.group.center))
# Make quantile plots of the test statistics for each
# iteration of the stability algorithm.
opar <- par(mfrow=c(2, 2), pty="s")
qqnorm(out$gammaHat.init,
main=sprintf("%d genes group centered", m))
abline(a=0, b=1)
qqnorm(out$gammaHat[[1]],
main=sprintf("%d genes group centered",
n.genes.to.group.center[1]))
abline(a=0, b=1)
qqnorm(out$gammaHat[[2]],
main=sprintf("%d genes group centered",
n.genes.to.group.center[2]))
abline(a=0, b=1)
qqnorm(out$gammaHat[[3]],
main=sprintf("%d genes group centered",
n.genes.to.group.center[3]))
abline(a=0, b=1)
par(opar)