mglasso {mglasso}        R Documentation
Inference of Multiscale Gaussian Graphical Model.
Description
Clusters variables using an l2 fusion penalty and simultaneously estimates a Gaussian graphical model structure with the addition of an l1 sparsity penalty.
Usage
mglasso(
  x,
  lambda1 = 0,
  fuse_thresh = 0.001,
  maxit = NULL,
  distance = c("euclidean", "relative"),
  lambda2_start = 1e-04,
  lambda2_factor = 1.5,
  precision = 0.01,
  weights_ = NULL,
  type = c("initial"),
  compact = TRUE,
  verbose = FALSE
)
Arguments
x: Numeric matrix (n x p). Multivariate normal sample with n independent observations.

lambda1: Positive numeric scalar. Lasso penalty.

fuse_thresh: Positive numeric scalar. Threshold for clusters fusion.

maxit: Integer scalar. Maximum number of iterations.

distance: Character. Distance between regression vectors with permutation on symmetric coefficients; one of "euclidean" or "relative".

lambda2_start: Numeric scalar. Starting value for the fused-group Lasso penalty (clustering penalty).

lambda2_factor: Numeric scalar. Multiplicative step used to update the fused-group Lasso penalty.

precision: Tolerance for the stopping criterion (duality gap).

weights_: Matrix of weights.

type: If "initial", use the classical version of MGLasso without weights.

compact: Logical scalar. If TRUE, only save results when the previous clusters differ from the current ones.

verbose: Logical scalar. Print trace. Default value is FALSE.
Details
Estimates a Gaussian graphical model structure while hierarchically grouping variables by optimizing a pseudo-likelihood function combining Lasso and fused-group Lasso penalties. The problem is solved via the COntinuation with NEsterov smoothing in a Shrinkage-Thresholding Algorithm (CONESTA; Hadj-Selem et al., 2018). Varying the fusion penalty \lambda_2 in a multiplicative fashion allows one to uncover a seemingly hierarchical clustering structure. For \lambda_2 = 0, the approach is theoretically equivalent to the Meinshausen-Bühlmann (2006) neighborhood selection, as it reduces to the optimization of a pseudo-likelihood function with an \ell_1 penalty (Rocha et al., 2008). The algorithm stops when all the variables have merged into one cluster. The criterion used to merge clusters is the \ell_2-norm distance between regression vectors.
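For example, a minimal sketch (not package code) of the multiplicative schedule controlled by the lambda2_start and lambda2_factor arguments, using their default values:

lambda2_start <- 1e-4
lambda2_factor <- 1.5
lambda2_path <- lambda2_start * lambda2_factor^(0:9)  # successive fusion penalties
round(lambda2_path, 6)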
For each iteration of the algorithm, the following function is optimized:

1/2 \sum_{i=1}^p || X^i - X^{\setminus i} \beta^i ||_2^2 + \lambda_1 \sum_{i=1}^p || \beta^i ||_1 + \lambda_2 \sum_{i < j} || \beta^i - \tau_{ij}(\beta^j) ||_2,

where \beta^i is the vector of coefficients obtained by regressing X^i on the remaining variables and \tau_{ij} is a permutation exchanging \beta_j^i with \beta_i^j.
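For concreteness, a minimal sketch of this permuted \ell_2 distance (an illustration under assumptions, not package code): beta is taken to be a p x p matrix with zero diagonal whose row i holds the coefficients from regressing X^i on the remaining variables, and dist_beta is a hypothetical helper.

dist_beta <- function(beta, i, j) {
  bi <- beta[i, ]
  bj <- beta[j, ]
  ## tau_ij: swap positions i and j in beta^j so that beta_i^j
  ## lines up with beta_j^i before taking the distance
  bj[c(i, j)] <- bj[c(j, i)]
  sqrt(sum((bi - bj)^2))  # euclidean l2 norm
}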
Value
A list-like object of class mglasso is returned.

out: List of lists. Each element of the list corresponds to a clustering level and contains the regression matrix beta together with the cluster-assignment vector clusters for that level (see Examples).

l1: The sparsity penalty lambda1 used for the fit.
See Also
conesta() for the problem solver, and plot_mglasso() for plotting the output of mglasso.
Examples
## Not run:
reticulate::use_condaenv("rmglasso", required = TRUE)

## Build a block-diagonal covariance matrix with K blocks of size p/K
n <- 50
K <- 3
p <- 9
rho <- 0.85
blocs <- list()
for (j in 1:K) {
  bloc <- matrix(rho, nrow = p/K, ncol = p/K)
  diag(bloc) <- 1
  blocs[[j]] <- bloc
}
mat.covariance <- Matrix::bdiag(blocs)
mat.covariance

## Draw a multivariate normal sample and standardize it
set.seed(11)
X <- mvtnorm::rmvnorm(n, mean = rep(0, p), sigma = as.matrix(mat.covariance))
X <- scale(X)

## Fit MGLasso with a lasso penalty of 0.1
res <- mglasso(X, lambda1 = 0.1, lambda2_start = 0.1)
res$out[[1]]$clusters
res$out[[1]]$beta
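
## A hedged sketch (component names as documented under Value): inspect the
## cluster sizes at each retained clustering level and the lasso penalty used.
for (lev in res$out) {
  print(table(lev$clusters))
}
res$l1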
## End(Not run)