R: Estimate Mean and Correlation Structure Using Stability...

jointMeanCovStability {jointMeanCov}

R Documentation

Estimate Mean and Correlation Structure Using Stability Selection

Description

Given a data matrix, this function performs stability selection as described in the section "Stability of Gene Sets" in the paper Joint mean and covariance estimation with unreplicated matrix-variate data (2018), M. Hornstein, R. Fan, K. Shedden, and S. Zhou; Journal of the American Statistical Association.

Usage

jointMeanCovStability(X, group.one.indices, rowpen,
  n.genes.to.group.center = NULL)

Arguments

`X`	Data matrix of size n by m.
`group.one.indices`	Vector of indices denoting rows in group one.
`rowpen`	Glasso penalty for estimating B, the row-row correlation matrix.
`n.genes.to.group.center`	Vector specifying the number of genes to group center on each iteration of the stability selection algorithm. The length of this vector is equal to the number of iterations of stability selection. If this argument is not provided, the default is a decreasing sequence starting with m, followed

Details

Let m[i] denote the number of group-centered genes on the ith iteration of stability selection (where m[i] is a decreasing sequence). Estimated group means are initialized using unweighted sample means. Then, for each iteration of stability selection: 1. The top m[i] genes are selected for group centering by ranking the estimated group differences from the previous iteration. 2. Group means and global means are estimated using GLS, using the inverse row covariance matrix from the previous iteration. The centered data matrix is then used as input to Gemini to estimate the inverse row covariance matrix B.hat.inv. 3. Group means are estimated using GLS with B.hat.inv.

Value

`n.genes.to.group.center`	Number of group centered genes on each iteration of stability selection.
`betaHat.init`	Matrix of size 2 by m, containing sample means for each group. Row 1 contains sample means for group one, and row 2 contains sample means for group two.
`gammaHat.init`	Vector of length m containing differences in sample means.
`B.inv.list`	List of estimated row-row inverse covariance matrices, where `B.inv.list[[i]]` corresponds to the estimate from the ith iteration of the algorithm, in which the number of group-centered genes is `n.genes.to.group.center[i]`. For identifiability, A and B are scaled so that A has trace m.
`corr.B.inv.list`	List of inverse correlation matrices corresponding to the inverse covariance matrices `B.inv.list`.
`betaHat`	List of matrices of size 2 by m, where m is the number of columns of X. For each matrix, entry (i, j) contains the estimated mean of the jth column for an individual in group i, with i = 1,2, and j = 1, ..., m. The matrix `betaHat[[i]]` contains the estimates for the ith iteration of stability selection.
`gamma.hat`	List of vectors of estimated group mean differences. The vector `gammaHat[[i]]` contains estimates for the ith iteration of stability selection.
`design.effecs`	Vector containing the estimated design effect for each iteration of stability selection.
`gls.test.stats`	List of vectors of test statistics for each iteration of stability selection.
`p.vals`	List of vectors of two-sided p-values, calculated using the standard normal distribution.
`p.vals.adjusted`	List of vectors of p-values, adjusted using the Benjamini-Hochberg fdr adjustment.

Examples

# Generate matrix-variate data.
n1 <- 5
n2 <- 5
n <- n1 + n2
group.one.indices <- 1:5
group.two.indices <- 6:10
m <- 20
M <- matrix(0, nrow=n, ncol=m)
# In this example, the first three variables have nonzero
# mean differences.
M[1:n1, 1:3] <- 3
M[(n1 + 1):n2, 1:3] <- -3
X <- matrix(rnorm(n * m), nrow=n, ncol=m) + M

# Apply the stability algorithm.
rowpen <- sqrt(log(m) / n)
n.genes.to.group.center <- c(10, 5, 2)
out <- jointMeanCovStability(
 X, group.one.indices, rowpen, c(2e3, n.genes.to.group.center))

# Make quantile plots of the test statistics for each
# iteration of the stability algorithm.
opar <- par(mfrow=c(2, 2), pty="s")
qqnorm(out$gammaHat.init,
  main=sprintf("%d genes group centered", m))
abline(a=0, b=1)
qqnorm(out$gammaHat[[1]],
  main=sprintf("%d genes group centered",
   n.genes.to.group.center[1]))
abline(a=0, b=1)
qqnorm(out$gammaHat[[2]],
  main=sprintf("%d genes group centered",
   n.genes.to.group.center[2]))
abline(a=0, b=1)
qqnorm(out$gammaHat[[3]],
  main=sprintf("%d genes group centered",
   n.genes.to.group.center[3]))
abline(a=0, b=1)
par(opar)

[Package jointMeanCov version 0.1.0 Index]