R: Lasso Method for the RCON(V, E) Models

sglasso {sglasso}

R Documentation

Lasso Method for the RCON(V, E) Models

Description

Fit the weighted l1-penalized RCON(V, E) models using a cyclic coordinate algorithm.

Usage

sglasso(S, mask, w = NULL, flg = NULL, min_rho = 1.0e-02, nrho = 50,  
        nstep = 1.0e+05, algorithm = c("ccd","ccm"), truncate = 1e-05, 
        tol = 1.0e-03)

Arguments

`S`	the empirical variance/covariance matrix;
`mask`	a symmetric matrix used to specify the equality constraints on the entries of the concentration matrix. See the example below for more details;
`w`	a vector specifying the weights used to compute the weighted l1-norm of the parameters of the RCON(V, E) model;
`flg`	a logical vector used to specify if a parameter is penalized, i.e., if `flg[i] = TRUE` then the i-th parameter is penalized, otherwise (`flg[i] = FALSE`) the maximum likelihood estimate is computed;
`min_rho`	last value of the sequence of tuning parameters used to compute the sglasso solution path. If `nrho = 1`, then `min_rho` is the value used to compute the sglasso estimate. Default value is 1.0e-02;
`nrho`	number of tuning parameters used to compute the sglasso solution path. Default is 50;
`nstep`	nonnegative integer used to specify the maximum number of iterations of the two cyclic coordinate algorithms. Default is 1.0e+05;
`algorithm`	character string specifying the algorithm used to fit the model, i.e., a cyclic coordinate descent (`ccd`) algorithm or a cyclic coordinate minimization (`ccm`) algorithm. Default is `ccd`;
`truncate`	at convergence all estimates below this value will be set to zero. Default is 1e-05;
`tol`	tolerance used to assess convergence. Default value is 1.0e-03.
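As a rough illustration of how a numeric `mask` can be read, the base R sketch below rebuilds the mask from the Examples section and tabulates its labels. It assumes that entries sharing a value share a colour class and that `NA` marks a structural zero; the object names are hypothetical and nothing here calls the package itself.

```r
# Mask from the Examples section: equal values are assumed to share a
# colour class, and NA is used for structural zeros.
p <- 5
mask <- outer(1:p, 1:p, function(i, j) 0.5^abs(i - j))
mask[1, 5] <- mask[1, 4] <- mask[2, 5] <- NA
mask[5, 1] <- mask[4, 1] <- mask[5, 2] <- NA

# The mask must be symmetric (exact comparison is safe for powers of 0.5).
stopifnot(identical(mask, t(mask)))

# Labels on the diagonal (vertices) and above it (edges).
vertex_labels <- diag(mask)
edge_labels <- mask[upper.tri(mask)]

# table() drops NA by default, so structural zeros are not counted as a class.
vertex_classes <- table(vertex_labels)
edge_classes <- table(edge_labels)

# Number of structural zeros among the unordered pairs (i, j), i < j.
n_zero <- sum(is.na(edge_labels))

vertex_classes
edge_classes
n_zero
```

Under these assumptions the mask has a single vertex colour class and two edge colour classes, one for each retained off-diagonal band.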

Details

The RCON(V, E) model (Hojsgaard and Lauritzen, 2008) is a restriction of the Gaussian graphical model in which a coloured graph specifies a set of equality constraints on the entries of the concentration matrix. Roughly speaking, a coloured graph implies a partition of the vertex set into R disjoint subsets, called vertex colour classes, and a partition of the edge set into S disjoint subsets, called edge colour classes. A specific colour is associated with each vertex/edge colour class. If we denote by K = (k_{ij}) the concentration matrix, i.e. the inverse of the variance/covariance matrix \Sigma, the coloured graph implies the following equality constraints:

k_{ii} = \eta_n for any index i belonging to the nth vertex colour class;
k_{ij} = \theta_m for any pair (i,j) belonging to the mth edge colour class.

Denoting by \psi = (\eta',\theta')' the (R+S)-dimensional parameter vector, the concentration matrix can be defined as

K(\psi) = \sum_{n=1}^R\eta_nD_n + \sum_{m=1}^S\theta_mT_m,

where D_n is a diagonal matrix with entries D^n_{ii} = 1 if the index i belongs to the nth vertex colour class and zero otherwise. In the same way, T_m is a symmetric matrix with entries T^m_{ij} = 1 if the pair (i,j) belongs to the mth edge colour class and zero otherwise. Using the previous specification of the concentration matrix, the structured graphical lasso (sglasso) estimator (Abbruzzo et al., 2014) is defined as

\hat\psi = \arg\max_{\psi} \log\det K(\psi) - tr\{SK(\psi)\} - \rho\sum_{m=1}^Sw_m|\theta_m|,

where S is the empirical variance/covariance matrix, \rho is the tuning parameter used to control the amount of shrinkage and w_m are weights used to define the weighted \ell_1-norm. By default, the sglasso function sets the weights equal to the cardinality of the edge colour classes.
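The definitions above can be sketched in base R. The helpers `build_K` and `sglasso_objective` are hypothetical names introduced for illustration, not functions of the package, and they do not reproduce its fitting algorithms. The sketch assumes that equal values in a numeric mask identify a colour class, that `NA` marks a structural zero, and that the cardinality of an edge colour class is counted over unordered pairs (i, j) with i < j.

```r
# Hypothetical helper: K(psi) = sum_n eta_n D_n + sum_m theta_m T_m,
# with colour classes taken in increasing order of their mask labels.
build_K <- function(eta, theta, mask) {
  p <- nrow(mask)
  v_lab <- sort(unique(diag(mask)))
  e_lab <- sort(unique(mask[upper.tri(mask)]))  # sort() drops NA
  off <- row(mask) != col(mask)
  K <- matrix(0, p, p)
  for (n in seq_along(v_lab)) {
    # D_n: diagonal indicator of the nth vertex colour class
    D_n <- diag(as.numeric(diag(mask) == v_lab[n]), nrow = p)
    K <- K + eta[n] * D_n
  }
  for (m in seq_along(e_lab)) {
    # T_m: symmetric indicator of the mth edge colour class
    T_m <- (off & !is.na(mask) & mask == e_lab[m]) * 1
    K <- K + theta[m] * T_m
  }
  K
}

# Hypothetical helper: penalized objective maximized by the sglasso estimator.
sglasso_objective <- function(eta, theta, S, mask, rho, w) {
  K <- build_K(eta, theta, mask)
  ld <- determinant(K, logarithm = TRUE)
  # The log-determinant needs det(K) > 0; a full positive-definiteness
  # check is omitted in this sketch.
  if (ld$sign <= 0) return(-Inf)
  as.numeric(ld$modulus) - sum(diag(S %*% K)) - rho * sum(w * abs(theta))
}

p <- 5
mask <- outer(1:p, 1:p, function(i, j) 0.5^abs(i - j))
mask[1, 5] <- mask[1, 4] <- mask[2, 5] <- NA
mask[5, 1] <- mask[4, 1] <- mask[5, 2] <- NA

# Weights set to the number of edges in each class (assumed counting rule).
w <- as.vector(table(mask[upper.tri(mask)]))

S_demo <- diag(p)  # identity matrix as a stand-in for an empirical covariance
K_demo <- build_K(eta = 1, theta = c(0.1, 0.2), mask = mask)
obj_zero <- sglasso_objective(1, c(0, 0), S_demo, mask, rho = 0.5, w = w)
obj_demo <- sglasso_objective(1, c(0.1, 0.2), S_demo, mask, rho = 0.5, w = w)
```

With both edge parameters set to zero, K(\psi) is the identity matrix, so the objective reduces to minus the trace of the stand-in S.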

Value

sglasso returns an object with S3 class "sglasso", i.e. a named list containing the following components:

`call`	the call that produced this object;
`nv`	number of vertex colour classes;
`ne`	number of edge colour classes;
`theta`	the matrix of the sglasso estimates. The first `nv` rows correspond to the unpenalized parameters while the remaining rows correspond to the weighted l1-penalized parameters;
`w`	the vector of weights used to define the weighted l1-norm;
`df`	`nrho`-dimensional vector of the number of estimated nonzero parameters;
`rho`	`nrho`-dimensional vector of the sequence of tuning parameters;
`grd`	the matrix of the scores;
`nstep`	nonnegative integer used to specify the maximum number of iterations of the algorithms;
`nrho`	number of tuning parameters used to compute the sglasso solution path;
`algorithm`	the algorithm used to fit the model;
`truncate`	the value used to set to zero the estimated parameters;
`tol`	a nonnegative value used to define the convergence of the algorithms;
`S`	the empirical variance/covariance matrix used to compute the sglasso solution path;
`mask`	the `mask` used to define the equality constraints on the entries of the concentration matrix;
`n`	number of iterations of the algorithm;
`conv`	an integer value encoding the warnings raised by the algorithms. If `conv = 0`, convergence has been achieved; otherwise, the maximum number of iterations has been reached.
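Following the component descriptions above, the sketch below shows one way to pull the estimates for a single tuning parameter out of a fitted object. `split_theta` is a hypothetical helper, not part of the package; it assumes that the columns of `theta` follow the order of `rho`, with the first `nv` rows holding the vertex parameters and the next `ne` rows the edge parameters. The fitting step runs only if the package is installed.

```r
# Hypothetical helper: estimates at the k-th tuning parameter, split into
# unpenalized vertex parameters and penalized edge parameters.
split_theta <- function(fit, k) {
  v_rows <- seq_len(fit$nv)
  e_rows <- fit$nv + seq_len(fit$ne)
  list(rho = fit$rho[k],
       df = fit$df[k],
       eta = fit$theta[v_rows, k],
       theta = fit$theta[e_rows, k])
}

if (requireNamespace("sglasso", quietly = TRUE)) {
  set.seed(1)  # simulated data, for illustration only
  N <- 100
  p <- 5
  X <- matrix(rnorm(N * p), N, p)
  S <- crossprod(X) / N
  mask <- outer(1:p, 1:p, function(i, j) 0.5^abs(i - j))

  fit <- sglasso::sglasso(S, mask)

  # conv = 0 signals convergence; anything else means nstep was exhausted.
  if (fit$conv != 0) warning("maximum number of iterations reached")

  # Estimates at the last value of the sequence of tuning parameters.
  str(split_theta(fit, fit$nrho))
}
```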

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro luigi.augugliaro@unipa.it

References

Abbruzzo, A., Augugliaro, L., Mineo, A. M. and Wit, E. C. (2014) Cyclic coordinate for penalized Gaussian Graphical Models with symmetry restrictions. In Proceedings of COMPSTAT 2014 - 21st International Conference on Computational Statistics, Geneva, August 19-24, 2014.

Hojsgaard, S. and Lauritzen, S. L. (2008) Graphical Gaussian models with edge and vertex symmetries. J. Roy. Statist. Soc. Ser. B., Vol. 70(5), 1005–1027.

Examples

########################################################
# sglasso solution path
#
## structural zeros:
## there are two ways to specify structural zeros which are 
## related to the kind of mask. If mask is a numeric matrix
## NA is used to identify the structural zero. If mask is a
## character matrix then the structural zeros are specified
## using NA or ".".
N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
mask

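# full solution path over the default sequence of 50 tuning parameters
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
out.sglasso_path

# refit at the 20th value of rho only, this time with the ccm algorithm
rho <- out.sglasso_path$rho[20]
out.sglasso <- sglasso(S, mask, nrho = 1, min_rho = rho, tol = 1.0e-13, algorithm = "ccm")
out.sglasso

# compare the estimates obtained at the same rho by the two algorithms
out.sglasso_path$theta[, 20]
out.sglasso$theta[, 1]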

[Package sglasso version 1.2.6 Index]