RIDGEsigma {ADMMsigma} | R Documentation |
Ridge penalized precision matrix estimation
Description
Ridge penalized precision matrix estimation via closed-form solution. If one is only interested in the ridge penalty, this function will be faster and provide a more precise estimate than using ADMMsigma.
Consider the case where X_{1}, ..., X_{n} are iid N_{p}(\mu, \Sigma) and we are tasked with estimating the precision matrix, denoted \Omega \equiv \Sigma^{-1}. This function solves the following optimization problem:

\hat{\Omega}_{\lambda} = \arg\min_{\Omega \in S_{+}^{p}} \left\{ \mathrm{Tr}\left(S\Omega\right) - \log \det\left(\Omega\right) + \frac{\lambda}{2}\left\| \Omega \right\|_{F}^{2} \right\}

where \lambda > 0 and \left\|\cdot\right\|_{F}^{2} is the squared Frobenius norm.
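Because the Frobenius penalty is smooth, the estimator has a closed form: setting the gradient S - \Omega^{-1} + \lambda\Omega to zero and sharing eigenvectors with S reduces the problem to a scalar quadratic per eigenvalue. The following is a minimal sketch of that solution, not the package's internal implementation:

```r
# Closed-form ridge-penalized precision estimate: share eigenvectors with S
# and solve the scalar quadratic lam * w^2 + d * w - 1 = 0 for each
# eigenvalue d of S, keeping the positive root.
ridge_omega <- function(S, lam) {
  eig <- eigen(S, symmetric = TRUE)
  w <- (-eig$values + sqrt(eig$values^2 + 4 * lam)) / (2 * lam)
  eig$vectors %*% diag(w) %*% t(eig$vectors)
}

# small check: the gradient of the objective should vanish at the estimate
S <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
Omega <- ridge_omega(S, lam = 0.1)
grad <- S - solve(Omega) + 0.1 * Omega
max(abs(grad))  # numerically zero
```

Since the solution is obtained from a single eigendecomposition of S, the full lam grid can be evaluated cheaply by reusing that decomposition.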
Usage
RIDGEsigma(X = NULL, S = NULL, lam = 10^seq(-2, 2, 0.1), path = FALSE,
K = 5, cores = 1, trace = c("none", "progress", "print"))
Arguments
X |
option to provide a nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable. |
S |
option to provide a pxp sample covariance matrix (denominator n). If argument is NULL and X is provided instead, S will be computed automatically. |
lam |
positive tuning parameters for the ridge penalty. If a vector of parameters is provided, they should be in increasing order. Defaults to grid of values 10^seq(-2, 2, 0.1). |
path |
option to return the regularization path. This option should be used with extreme care if the dimension is large. If set to TRUE, cores will be set to 1 and errors and optimal tuning parameters will be based on the full sample. Defaults to FALSE. |
K |
specify the number of folds for cross validation. |
cores |
option to run CV in parallel. Defaults to cores = 1. |
trace |
option to display progress of CV. Choose one of "progress" to print a progress bar, "print" to print completed tuning parameters, or "none". Defaults to "none". |
Value
returns class object RIDGEsigma
which includes:
Lambda |
optimal tuning parameter. |
Lambdas |
grid of lambda values for CV. |
Omega |
estimated penalized precision matrix. |
Sigma |
estimated covariance matrix from the penalized precision matrix (inverse of Omega). |
Path |
array containing the solution path. Solutions are ordered dense to sparse. |
Gradient |
gradient of optimization function (penalized gaussian likelihood). |
MIN.error |
minimum average cross validation error (cv.crit) for optimal parameters. |
AVG.error |
average cross validation error (cv.crit) across all folds. |
CV.error |
cross validation errors (cv.crit). |
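The CV quantities above (Lambda, AVG.error, MIN.error) can be illustrated with a short self-contained sketch. This is not the package's code: the fold assignment and the choice of cv.crit (taken here as the validation-set Gaussian loss Tr(S.valid %*% Omega) - log det(Omega)) are assumptions, and the closed-form ridge estimate is defined inline so the sketch runs on its own:

```r
# closed-form ridge-penalized precision estimate (inlined for self-containment)
ridge_omega <- function(S, lam) {
  eig <- eigen(S, symmetric = TRUE)
  w <- (-eig$values + sqrt(eig$values^2 + 4 * lam)) / (2 * lam)
  eig$vectors %*% diag(w) %*% t(eig$vectors)
}

# K-fold CV over a grid of lambdas; cv.crit is ASSUMED to be the
# validation-set Gaussian loss Tr(S.valid %*% Omega) - log det(Omega)
cv_ridge <- function(X, lams, K = 5) {
  n <- nrow(X)
  folds <- sample(rep(seq_len(K), length.out = n))
  err <- matrix(0, length(lams), K)
  for (k in seq_len(K)) {
    # denominator-n sample covariances on the training and validation folds
    S_tr <- crossprod(scale(X[folds != k, , drop = FALSE], scale = FALSE)) / sum(folds != k)
    S_va <- crossprod(scale(X[folds == k, , drop = FALSE], scale = FALSE)) / sum(folds == k)
    for (i in seq_along(lams)) {
      Omega <- ridge_omega(S_tr, lams[i])
      err[i, k] <- sum(S_va * Omega) - as.numeric(determinant(Omega)$modulus)
    }
  }
  list(Lambda = lams[which.min(rowMeans(err))], AVG.error = rowMeans(err))
}

set.seed(123)
X <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
fit <- cv_ridge(X, lams = 10^seq(-2, 2, 0.5), K = 5)
fit$Lambda  # selected tuning parameter
```

The selected Lambda minimizes the fold-averaged error, mirroring how MIN.error and AVG.error are reported above.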
Author(s)
Matt Galloway gall0441@umn.edu
References
Rothman, Adam. 2017. 'STAT 8931 notes on an algorithm to compute the Lasso-penalized Gaussian likelihood precision matrix estimator.'
See Also
ADMMsigma
Examples
# generate data from a matrix with a sparse precision matrix
# first compute the covariance matrix: S[i, j] = 0.7^|i - j|
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
for (j in 1:5){
S[i, j] = S[i, j]^abs(i - j)
}
}
# generate 100 x 5 matrix with rows drawn from iid N_p(0, S)
set.seed(123)
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt
# ridge penalty (no ADMM)
RIDGEsigma(X, lam = 10^seq(-5, 5, 0.5))