fit_gslope {sgs}    R Documentation

Fit a gSLOPE model.

Description

Main fitting function for Group SLOPE (gSLOPE). Supports linear and logistic regression, each with dense and sparse matrix implementations.

Usage

fit_gslope(
  X,
  y,
  groups,
  type = "linear",
  lambda = "path",
  path_length = 20,
  min_frac = 0.05,
  gFDR = 0.1,
  pen_method = 1,
  max_iter = 5000,
  backtracking = 0.7,
  max_iter_backtracking = 100,
  tol = 1e-05,
  standardise = "l2",
  intercept = TRUE,
  screen = TRUE,
  verbose = FALSE,
  w_weights = NULL
)

Arguments

X

Input matrix of dimensions n \times p. Can be a sparse matrix (using class "sparseMatrix" from the Matrix package).

y

Output vector of dimension n. For type="linear" it should be continuous, and for type="logistic" it should be a binary variable.

groups

A grouping structure for the input data. Should take the form of a vector of group indices.
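As a minimal base R sketch (the variable names `m` and `group_sizes` are illustrative only and are not part of the package), a grouping vector holds one group index per column of X, and the group sizes p_g used in the penalty can be read off with table():

```r
# One group index per column of X: 10 predictors in 4 groups
groups <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4)

# Number of groups (m) and size of each group (p_g)
m <- length(unique(groups))
group_sizes <- as.vector(table(groups))
```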

type

The type of regression to perform. Supported values are: "linear" and "logistic".

lambda

The regularisation parameter. Defines the level of sparsity in the model. A higher value leads to sparser models:

  • "path" computes a path of regularisation parameters of length "path_length". The path will begin just above the value at which the first predictor enters the model and will terminate at the value determined by "min_frac".

  • User-specified single value or sequence. Internal scaling is applied based on the type of standardisation. The returned "lambda" value will be the original unscaled value(s).

path_length

The number of \lambda values to fit the model for. If "lambda" is user-specified, this is ignored.

min_frac

Defines the termination point of the pathwise solution, so that \lambda_\text{min} = min_frac \cdot \lambda_\text{max}.
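The relationship between "path_length", "min_frac" and the endpoints of the path can be sketched in base R as follows. The helper `make_lambda_path` is hypothetical, and the log-linear spacing between the endpoints is an assumption (a common convention for regularisation paths), not something this page confirms about the package.

```r
# Hypothetical helper, not part of sgs: build a decreasing lambda path.
# Assumption: values are log-linearly spaced between lambda_max and
# lambda_min = min_frac * lambda_max; the package may space them differently.
make_lambda_path <- function(lambda_max, path_length = 20, min_frac = 0.05) {
  lambda_min <- min_frac * lambda_max
  exp(seq(log(lambda_max), log(lambda_min), length.out = path_length))
}

path <- make_lambda_path(lambda_max = 2, path_length = 5, min_frac = 0.05)
```

Under this assumption the first value equals \lambda_\text{max} and the last equals min_frac \cdot \lambda_\text{max}.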

gFDR

Defines the desired group false discovery rate (FDR) level, which determines the shape of the group penalties. Must be between 0 and 1.

pen_method

The type of penalty sequences to use (see Brzyski et al. (2019)):

  • "1" uses the gMean gSLOPE sequence.

  • "2" uses the gMax gSLOPE sequence.
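To illustrate how gFDR shapes a penalty sequence, the following base R sketch computes a gMax-style sequence. It is based on one reading of Brzyski et al. (2019) and is not the package's implementation: the helper name `gmax_sequence`, the group weights w_j = \sqrt{p_j}, and the exact quantile form are all assumptions.

```r
# Hypothetical helper, not the sgs implementation.
# Assumptions: group weights w_j = sqrt(p_j), and the i-th penalty is
#   max_j { qchi(1 - gFDR * i / m, df = p_j) / w_j },
# where qchi is the quantile function of the chi distribution.
gmax_sequence <- function(group_sizes, gFDR = 0.1) {
  m <- length(group_sizes)
  w <- sqrt(group_sizes)
  sapply(seq_len(m), function(i) {
    # chi quantile = square root of the chi-squared quantile
    max(sqrt(qchisq(1 - gFDR * i / m, df = group_sizes)) / w)
  })
}

pen <- gmax_sequence(group_sizes = c(3, 2, 3, 2), gFDR = 0.1)
```

The resulting sequence is non-increasing, so the largest penalty is available for the largest group effect.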

max_iter

Maximum number of ATOS iterations to perform.

backtracking

The backtracking parameter, \tau, as defined in Pedregosa et al. (2018).

max_iter_backtracking

Maximum number of backtracking line search iterations to perform per global iteration.

tol

Convergence tolerance for the stopping criteria.

standardise

Type of standardisation to perform on X:

  • "l2" standardises the input data to have \ell_2 norms of one. With this option, "lambda" is scaled internally by 1/\sqrt{n}.

  • "l1" standardises the input data to have \ell_1 norms of one. With this option, "lambda" is scaled internally by 1/n.

  • "sd" standardises the input data to have standard deviation of one.

  • "none" applies no standardisation.
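The "l2" option can be sketched in base R as follows. The helper `standardise_l2` is hypothetical; centring the columns before scaling, and reading "scaled by 1/\sqrt{n}" as multiplication, are both assumptions rather than confirmed package behaviour.

```r
# Hypothetical helper, not the sgs implementation.
# Assumption: columns are centred before being scaled to unit l2 norm.
standardise_l2 <- function(X) {
  X_centred <- sweep(X, 2, colMeans(X), "-")
  col_norms <- sqrt(colSums(X_centred^2))
  sweep(X_centred, 2, col_norms, "/")
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5, ncol = 4)
X_std <- standardise_l2(X)

# Assumed internal scaling of a user-supplied lambda under "l2"
n <- nrow(X)
lambda_scaled <- 0.5 / sqrt(n)
```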

intercept

Logical flag for whether to fit an intercept.

screen

Logical flag for whether to apply screening rules (see Feser and Evangelou (2024)). Screening discards groups that are unlikely to be active before fitting, which can substantially reduce fitting time.

verbose

Logical flag for whether to print fitting information.

w_weights

Optional vector for the group penalty weights. Overrides the penalties from pen_method if specified. When entering custom weights, these are multiplied internally by \lambda and 1-\alpha. To avoid this behaviour, set \lambda = 2 and \alpha = 0.5.

Details

fit_gslope() fits a gSLOPE model using adaptive three operator splitting (ATOS). gSLOPE is a group selection method: it selects whole groups of variables, so every variable within a selected group is set as active. It solves the convex optimisation problem given by

\frac{1}{2n} f(b ; y, \mathbf{X}) + \lambda \sum_{g=1}^{m}w_g \sqrt{p_g} \|b^{(g)}\|_2,

where the penalty sequences are sorted and f(\cdot) is the loss function. In the case of the linear model, the loss function is given by the mean-squared error loss:

f(b; y, \mathbf{X}) = \left\|y-\mathbf{X}b \right\|_2^2.

In the logistic model, the loss function is given by

f(b; y, \mathbf{X}) = -\frac{1}{n} \log(\mathcal{L}(b; y, \mathbf{X})),

where the log-likelihood is given by

\log(\mathcal{L}(b; y, \mathbf{X})) = \sum_{i=1}^{n}\left\{y_i b^\intercal x_i - \log(1+\exp(b^\intercal x_i)) \right\}.
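The log-likelihood sum above can be evaluated in base R as in the following sketch. The helper `logistic_loglik` is hypothetical and is not the package's internal code; the rewrite of \log(1+\exp(\cdot)) is a standard numerical-stability device.

```r
# Hypothetical helper: evaluates sum_i { y_i * eta_i - log(1 + exp(eta_i)) }
# with eta_i = b' x_i. Not the sgs implementation.
logistic_loglik <- function(b, y, X) {
  eta <- as.vector(X %*% b)
  # log(1 + exp(eta)) computed stably as max(eta, 0) + log1p(exp(-|eta|))
  sum(y * eta - (pmax(eta, 0) + log1p(exp(-abs(eta)))))
}

X <- matrix(c(1, 0, 0, 1, 1, 1), nrow = 3, byrow = TRUE)
y <- c(1, 0, 1)
ll_zero <- logistic_loglik(b = c(0, 0), y = y, X = X)
```

With b = 0 every linear predictor is zero, so each observation contributes -\log(2) and the total is -3\log(2).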

The penalty parameters in gSLOPE are sorted so that the largest group effects are matched with the largest penalties, to reduce the group FDR.
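The sorted matching of group effects and penalties can be sketched in base R as follows. The helper `gslope_penalty` is hypothetical, and ranking groups by \sqrt{p_g}\|b^{(g)}\|_2 before pairing them with a non-increasing weight sequence is an assumption about how the sorting is applied.

```r
# Hypothetical helper, not the sgs implementation.
# Assumption: groups are ranked by sqrt(p_g) * ||b^(g)||_2 (largest first)
# and paired with the weights w sorted in non-increasing order.
gslope_penalty <- function(b, groups, w, lambda) {
  group_norms <- as.vector(tapply(b, groups, function(v) sqrt(sum(v^2))))
  group_sizes <- as.vector(table(groups))
  scaled <- sort(sqrt(group_sizes) * group_norms, decreasing = TRUE)
  lambda * sum(sort(w, decreasing = TRUE) * scaled)
}

b <- c(3, 4, 0, 0, 1)
groups <- c(1, 1, 2, 2, 3)
pen_value <- gslope_penalty(b, groups, w = c(2, 1, 0.5), lambda = 1)
```

Here the scaled group norms are 5\sqrt{2}, 0 and 1, so the largest weight (2) is paired with 5\sqrt{2}, giving a penalty of 10\sqrt{2} + 1.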

Value

A list containing:

beta

The fitted values from the regression. Taken to be the more stable fit between x and z, which is usually the former. A filter is applied to remove very small values, where ATOS has not been able to shrink exactly to zero. Check this against x and z.

group_effects

The group values from the regression. Taken by applying the \ell_2 norm within each group on beta.

selected_var

A list containing the indices of the active/selected variables for each "lambda" value.

selected_grp

A list containing the indices of the active/selected groups for each "lambda" value.
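For a single "lambda" value, the relationship between beta, group_effects and the selected sets can be sketched in base R. The coefficient vector below is made up for illustration and is not output from fit_gslope(); treating any non-zero value as active is an assumption about how the selection is derived.

```r
# Illustrative coefficient vector and grouping (not output from fit_gslope)
beta <- c(0.5, -1.2, 0.3, 0, 0, 0, 0, 0, 2, 0.1)
groups <- c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4)

# Group effects: l2 norm of the coefficients within each group
group_effects <- as.vector(tapply(beta, groups, function(v) sqrt(sum(v^2))))

# Active variables and groups: indices with non-zero values
selected_var <- which(beta != 0)
selected_grp <- which(group_effects != 0)
```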

pen_gslope

Vector of the group penalty sequence.

lambda

Value(s) of \lambda used to fit the model.

type

Indicates which type of regression was performed.

standardise

Type of standardisation used.

intercept

Logical flag indicating whether an intercept was fit.

num_it

Number of iterations performed. If convergence is not reached, this will be max_iter.

success

Logical flag indicating whether ATOS converged, according to tol.

certificate

Final value of the convergence criterion.

x

The solution to the original problem (see Pedregosa et al. (2018)).

u

The solution to the dual problem (see Pedregosa et al. (2018)).

z

The updated values from applying the first proximal operator (see Pedregosa et al. (2018)).

screen_set

List of groups that were kept after the screening step for each "lambda" value (corresponds to \mathcal{S} in Feser and Evangelou (2024)).

epsilon_set

List of groups that were used for fitting after screening for each "lambda" value (corresponds to \mathcal{E} in Feser and Evangelou (2024)).

kkt_violations

List of groups that violated the KKT conditions for each "lambda" value (corresponds to \mathcal{K} in Feser and Evangelou (2024)).

screen

Logical flag indicating whether screening was applied.

References

Brzyski, D., Gossmann, A., Su, W., Bogdan, M. (2019). Group SLOPE – Adaptive Selection of Groups of Predictors, https://www.tandfonline.com/doi/full/10.1080/01621459.2017.1411269

Feser, F., Evangelou, M. (2024). Strong screening rules for group-based SLOPE models.

Pedregosa, F., Gidel, G. (2018). Adaptive Three Operator Splitting, https://proceedings.mlr.press/v80/pedregosa18a.html

See Also

Other gSLOPE-methods: coef.sgs(), fit_gslope_cv(), plot.sgs(), predict.sgs(), print.sgs()

Examples

# specify a grouping structure
groups = c(1,1,1,2,2,3,3,3,4,4)
# generate data
data = gen_toy_data(p = 10, n = 5, groups = groups, seed_id = 3, group_sparsity = 1)
# run gSLOPE
model = fit_gslope(X = data$X, y = data$y, groups = groups, type = "linear", path_length = 5,
                   gFDR = 0.1, standardise = "l2", intercept = TRUE, verbose = FALSE)

[Package sgs version 0.2.0 Index]