grpSLOPE {grpSLOPE} | R Documentation |
Group SLOPE (Group Sorted L-One Penalized Estimation)
Description
Performs selection of significant groups of predictors and estimation of the corresponding coefficients using the Group SLOPE method (see Brzyski et al., 2016).
Usage
grpSLOPE(
X,
y,
group,
fdr,
lambda = "corrected",
sigma = NULL,
verbose = FALSE,
orthogonalize = NULL,
normalize = TRUE,
max.iter = 10000,
dual.gap.tol = 1e-06,
infeas.tol = 1e-06,
x.init = NULL,
...
)
Arguments
X |
The model matrix |
y |
The response variable |
group |
A vector describing the grouping structure. It should contain a group id for each predictor variable. |
fdr |
Target group false discovery rate (gFDR) |
lambda |
Method used to obtain the regularizing sequence lambda. Possible values are "max", "mean", and "corrected" (default). See lambdaGroupSLOPE for detail. |
sigma |
Noise level. If omitted, it is estimated from the data using Procedure 2 in Brzyski et al. (2016). |
verbose |
A |
orthogonalize |
Whether to orthogonalize the model matrix within each group. Do not set manually unless you are certain that your data are appropriately pre-processed. |
normalize |
Whether to center the input data and re-scale the columns of the design matrix to have unit norms. Do not disable this unless you are certain that your data are appropriately pre-processed. |
max.iter |
|
dual.gap.tol |
|
infeas.tol |
|
x.init |
|
... |
Options passed to |
Details
Multiple methods are available to generate the regularizing sequence lambda, see lambdaGroupSLOPE for detail.
The model matrix is transformed by orthogonalization within each group (see Section 2.1 in Brzyski et al., 2016), and penalization is imposed on \| X_{I_i} \beta_{I_i} \|.
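For orientation, the optimization problem behind this penalty can be written out as follows. This is a sketch that follows the notation of the cited paper as best it can be reconstructed here; the symbols m (number of groups), w_i (positive group weights), and v (the vector of weighted group effects) are assumptions introduced for this illustration and are not part of the grpSLOPE interface.

```latex
\hat{\beta} \in \operatorname*{arg\,min}_{b}
  \left\{ \tfrac{1}{2} \| y - X b \|_2^2
  + \sigma \sum_{i=1}^{m} \lambda_i \, v_{(i)} \right\},
\qquad
v_i = w_i \, \| X_{I_i} b_{I_i} \|_2 ,
\qquad
v_{(1)} \ge v_{(2)} \ge \dots \ge v_{(m)} ,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0 .
```

Under this reading, the largest weighted group effect is paired with the largest entry of lambda, which is what makes the penalty a sorted L-one norm applied at the group level.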
When orthogonalize = TRUE, due to within-group orthogonalization, the solution vector beta cannot be computed if a group submatrix does not have full column rank (e.g., if there are more predictors in a selected group than there are observations). In that case only the solution vector c of the transformed (orthogonalized) model is returned. Additionally, in any case the vector group.norms is returned with its i-th entry being \| X_{I_i} \beta_{I_i} \|, i.e., the overall effect of each group.
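The relationship between beta, c, and the group norm can be illustrated with a small base R sketch. It assumes that within-group orthogonalization is done through a QR decomposition of the group submatrix; the internal implementation in grpSLOPE may differ in detail, and the names X_I, beta_I, and c_I are hypothetical and chosen only for this example.

```r
# Sketch: within-group orthogonalization for ONE hypothetical group,
# assuming a QR decomposition X_I = Q R (not taken from grpSLOPE's source).
set.seed(42)
n <- 20
X_I <- matrix(rnorm(n * 3), n, 3)   # group submatrix with 3 predictors
beta_I <- c(1.5, -2, 0.5)           # coefficients of that group

qr_I <- qr(X_I)
Q <- qr.Q(qr_I)                     # n x 3 matrix with orthonormal columns
R <- qr.R(qr_I)                     # 3 x 3 upper triangular matrix

# Coefficients of the transformed model: X_I %*% beta_I equals Q %*% c_I
c_I <- drop(R %*% beta_I[qr_I$pivot])

# Because Q has orthonormal columns, the group effect equals the norm of c_I
norm_via_beta <- sqrt(sum((X_I %*% beta_I)^2))
norm_via_c <- sqrt(sum(c_I^2))
all.equal(norm_via_beta, norm_via_c)

# With full column rank, R is invertible and beta_I can be recovered from c_I
beta_back <- numeric(length(beta_I))
beta_back[qr_I$pivot] <- backsolve(R, c_I)
all.equal(beta_back, beta_I)
```

If the group had more predictors than observations, R would not be invertible and the last step would fail, which is consistent with only c being returned in that situation.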
Note that all of these results are returned on the scale of the normalized versions of X and y. However, original.scale contains the regression coefficients transformed to correspond to the original (unaltered) X and y. In that case, an estimate for the intercept term is also returned with the other coefficients in original.scale (while on the normalized scale the estimate of the intercept is always equal to zero, and is not explicitly provided in the grpSLOPE output).
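The back-transformation from the normalized scale to the original scale can be sketched in base R. This example assumes the normalization described for the normalize argument (centering X and y, and scaling the columns of X to unit Euclidean norm) and uses ordinary least squares as a stand-in for the penalized fit, so it shows only the rescaling arithmetic, not grpSLOPE's own code. All variable names are hypothetical.

```r
# Sketch: mapping coefficients from the normalized scale back to the
# original scale (ordinary least squares stands in for the penalized fit).
set.seed(7)
n <- 50
p <- 3
X <- matrix(rnorm(n * p, mean = 5, sd = 2), n, p)
y <- drop(2 + X %*% c(1, -0.5, 0.25) + rnorm(n, sd = 0.1))

# Normalize: center the columns of X and y, scale columns of X to unit norm
X_centered <- sweep(X, 2, colMeans(X))
col_norms <- sqrt(colSums(X_centered^2))
X_norm <- sweep(X_centered, 2, col_norms, "/")
y_centered <- y - mean(y)

# Fit without an intercept on the normalized scale
beta_norm <- unname(coef(lm(y_centered ~ X_norm - 1)))

# Back-transform: undo the column scaling, then recover the intercept
beta_orig <- beta_norm / col_norms
intercept <- mean(y) - sum(colMeans(X) * beta_orig)

# Agrees with a direct fit on the original data
direct <- unname(coef(lm(y ~ X)))
all.equal(c(intercept, beta_orig), direct)
```

The intercept appears only after the back-transformation because centering removes it on the normalized scale.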
Value
A list with members:
- beta
Solution vector. See Details.
- c
Solution vector of the transformed model. See Details.
- group.norms
Overall effect of each group. See Details.
- selected
Names of selected groups (i.e., groups of predictors with at least one non-zero coefficient estimate)
- optimal
Convergence status
- iter
Iterations of the proximal gradient method
- lambda
Regularizing sequence
- lambda.method
Method used to construct the regularizing sequence
- sigma
(Estimated) noise level
- group
The provided grouping structure (corresponding to beta)
- group.c
Grouping structure of the transformed model (corresponding to c)
- original.scale
A list containing the estimated intercept and regression coefficients on the original scale. See Details.
References
D. Brzyski, A. Gossmann, W. Su, and M. Bogdan (2016) Group SLOPE – adaptive selection of groups of predictors, https://arxiv.org/abs/1610.04960
D. Brzyski, A. Gossmann, W. Su, and M. Bogdan (2019) Group SLOPE – adaptive selection of groups of predictors. Journal of the American Statistical Association 114 (525): 419–33.
Examples
# generate some data
set.seed(1)
A <- matrix(rnorm(100^2), 100, 100)
grp <- rep(1:20, each=5)
b <- c(runif(20), rep(0, 80))
# (i.e., groups 1, 2, 3, and 4 are truly significant)
y <- A %*% b + rnorm(10)
fdr <- 0.1 # target false discovery rate
# fit a Group SLOPE model
result <- grpSLOPE(X=A, y=y, group=grp, fdr=fdr)
result$selected
# [1] "1" "2" "3" "4" "14"
result$sigma
# [1] 0.7968632
head(result$group.norms)
#         1         2         3         4         5         6
#  2.905449  5.516103  8.964201 10.253792  0.000000  0.000000
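As a follow-up to the example, the realized proportion of false group discoveries can be computed from the selection shown above. The sketch below hard-codes the printed value of result$selected so that it runs without the package; the names selected, truth, and fdp are hypothetical.

```r
# Sketch: realized group false discovery proportion for the example above.
# `selected` copies the printed output of result$selected.
selected <- c("1", "2", "3", "4", "14")
truth <- as.character(1:4)          # groups with non-zero true coefficients

false_groups <- setdiff(selected, truth)
fdp <- mean(!(selected %in% truth))

false_groups
# [1] "14"
fdp
# [1] 0.2
```

The gFDR is the expectation of this proportion over repeated samples, so a single realized value above the target of 0.1 does not by itself indicate a problem.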