fit_sgs {sgs}		R Documentation

Fit an SGS model

Description

Sparse-group SLOPE (SGS) main fitting function. Supports both linear and logistic regression, with dense and sparse matrix implementations.

Usage

fit_sgs(
  X,
  y,
  groups,
  pen_method = 1,
  type = "linear",
  lambda,
  alpha = 0.95,
  vFDR = 0.1,
  gFDR = 0.1,
  max_iter = 5000,
  backtracking = 0.7,
  max_iter_backtracking = 100,
  tol = 1e-05,
  standardise = "l2",
  intercept = TRUE,
  w_weights = NULL,
  v_weights = NULL,
  x0 = NULL,
  u = NULL,
  verbose = FALSE
)

Arguments

X

Input matrix of dimensions n \times p. Can be a sparse matrix (using class "sparseMatrix" from the Matrix package).

y

Output vector of dimension n. For type="linear" should be continuous and for type="logistic" should be a binary variable.

groups

A grouping structure for the input data. Should take the form of a vector of group indices.

pen_method

The type of penalty sequences to use (see Feser et al. (2023)):

  • "1" uses the vMean SGS and gMean gSLOPE sequences.

  • "2" uses the vMax SGS and gMean gSLOPE sequences.

  • "3" uses the BH SLOPE and gMean gSLOPE sequences, also known as SGS Original.

type

The type of regression to perform. Supported values are: "linear" and "logistic".

lambda

The value of \lambda, which defines the level of sparsity in the model. Can be picked using cross-validation (see fit_sgs_cv()). Must be a positive value.

alpha

The value of \alpha, which defines the convex balance between SLOPE and gSLOPE. Must be between 0 and 1.

vFDR

Defines the desired variable false discovery rate (FDR) level, which determines the shape of the variable penalties. Must be between 0 and 1.

gFDR

Defines the desired group false discovery rate (FDR) level, which determines the shape of the group penalties. Must be between 0 and 1.

max_iter

Maximum number of ATOS iterations to perform.

backtracking

The backtracking parameter, \tau, as defined in Pedregosa and Gidel (2018).

max_iter_backtracking

Maximum number of backtracking line search iterations to perform per global iteration.

tol

Convergence tolerance for the stopping criteria.

standardise

Type of standardisation to perform on X:

  • "l2" standardises the input data to have \ell_2 norms of one.

  • "l1" standardises the input data to have \ell_1 norms of one.

  • "sd" standardises the input data to have standard deviation of one.

  • "none" no standardisation applied.

intercept

Logical flag for whether to fit an intercept.

w_weights

Optional vector for the group penalty weights. Overrides the penalties from pen_method if specified. When entering custom weights, these are multiplied internally by \lambda and 1-\alpha. To avoid this behaviour, set \lambda = 2 and \alpha = 0.5, so that \lambda(1-\alpha) = 1.

v_weights

Optional vector for the variable penalty weights. Overrides the penalties from pen_method if specified. When entering custom weights, these are multiplied internally by \lambda and \alpha. To avoid this behaviour, set \lambda = 2 and \alpha = 0.5, so that \lambda\alpha = 1.

x0

Optional initial vector for x_0.

u

Optional initial vector for u.

verbose

Logical flag for whether to print fitting information.

Details

fit_sgs() fits an SGS model using adaptive three operator splitting (ATOS). SGS is a sparse-group method, meaning it selects both variables and groups. Unlike group selection approaches, not every variable within an active group is selected. It solves the convex optimisation problem given by

\frac{1}{2n} f(b ; y, \mathbf{X}) + \lambda \alpha \sum_{i=1}^{p}v_i |b|_{(i)} + \lambda (1-\alpha)\sum_{g=1}^{m}w_g \sqrt{p_g} \|b^{(g)}\|_2,

where f(\cdot) is the loss function. In the case of the linear model, the loss function is given by the squared error loss:

f(b; y, \mathbf{X}) = \left\|y-\mathbf{X}b \right\|_2^2.

In the logistic model, the loss function is given by

f(b;y,\mathbf{X})=-1/n \log(\mathcal{L}(b; y, \mathbf{X})),

where the log-likelihood is given by

\mathcal{L}(b; y, \mathbf{X}) = \sum_{i=1}^{n}\left\{y_i b^\intercal x_i - \log(1+\exp(b^\intercal x_i)) \right\}.
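To make this concrete, the logistic loss can be evaluated directly from the definitions above (an illustrative sketch, not the package's internal code):

```r
# Evaluate the logistic loss f(b; y, X) = -(1/n) * log-likelihood,
# following the definitions above (illustrative sketch only).
logistic_loss <- function(b, y, X) {
  eta <- as.vector(X %*% b)           # linear predictors b^T x_i
  -mean(y * eta - log(1 + exp(eta)))  # -(1/n) * sum of log-likelihood terms
}
```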

SGS can be seen as a convex combination of SLOPE and gSLOPE, balanced through alpha, such that it reduces to SLOPE for alpha = 1 and to gSLOPE for alpha = 0. The penalty sequences in SGS are sorted so that the largest coefficients are matched with the largest penalties, to reduce the FDR.
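The penalty term of the objective above can be sketched as follows (an illustrative function, not part of the package; it assumes the sequences v and w are already sorted in decreasing order):

```r
# Evaluate the SGS penalty for a coefficient vector b (illustrative
# sketch only). v and w are decreasing penalty sequences, matched to
# the sorted absolute coefficients and sorted group norms.
sgs_penalty <- function(b, groups, v, w, lambda, alpha) {
  # SLOPE part: sorted |b| matched with the decreasing sequence v
  slope_part <- sum(v * sort(abs(b), decreasing = TRUE))
  # gSLOPE part: sqrt(p_g)-scaled group norms matched with w
  group_norms <- sapply(split(b, groups),
                        function(bg) sqrt(length(bg)) * sqrt(sum(bg^2)))
  gslope_part <- sum(w * sort(group_norms, decreasing = TRUE))
  lambda * alpha * slope_part + lambda * (1 - alpha) * gslope_part
}
```

Setting alpha = 1 here leaves only the SLOPE part, and alpha = 0 only the gSLOPE part, matching the reduction described above.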

Value

A list containing:

beta

The fitted values from the regression. Taken to be the more stable fit between x and u, which is usually the former.

x

The solution to the original problem (see Pedregosa and Gidel (2018)).

u

The solution to the dual problem (see Pedregosa and Gidel (2018)).

z

The updated values from applying the first proximal operator (see Pedregosa and Gidel (2018)).

type

Indicates which type of regression was performed.

pen_slope

Vector of the variable penalty sequence.

pen_gslope

Vector of the group penalty sequence.

lambda

Value of \lambda used to fit the model.

success

Logical flag indicating whether ATOS converged, according to tol.

num_it

Number of iterations performed. If convergence is not reached, this will be max_iter.

certificate

Final value of convergence criteria.

intercept

Logical flag indicating whether an intercept was fit.

References

F. Feser, M. Evangelou (2023) Sparse-group SLOPE: adaptive bi-level selection with FDR-control, https://arxiv.org/abs/2305.09467

F. Pedregosa, G. Gidel (2018) Adaptive Three Operator Splitting, https://proceedings.mlr.press/v80/pedregosa18a.html

Examples

# specify a grouping structure
groups = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4)
# generate data
data = generate_toy_data(p = 10, n = 5, groups = groups, seed_id = 3, group_sparsity = 1)
# run SGS
model = fit_sgs(X = data$X, y = data$y, groups = groups, type = "linear",
                lambda = 1, alpha = 0.95, vFDR = 0.1, gFDR = 0.1,
                standardise = "l2", intercept = TRUE, verbose = FALSE)

[Package sgs version 0.1.1 Index]