grpsel {grpsel} | R Documentation |
Group subset selection
Description
Fits the regularisation surface for a regression model with a group subset selection
penalty. The group subset penalty can be combined with either a group lasso or ridge penalty
for shrinkage. The group subset parameter is lambda and the group lasso/ridge parameter is
gamma.
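As a reading aid, the combined penalty can be sketched as the objective below. This is an assumed formulation, not one stated on this page: L denotes the loss, beta_k the coefficients of group k, w_k and v_k stand for the entries of lambda.factor and gamma.factor, and q is taken to be 1 for the group lasso and 2 for ridge.

```latex
\min_{\beta} \; L(\beta)
  + \lambda \sum_{k=1}^{g} w_k \, 1\left( \lVert \beta_k \rVert_2 \neq 0 \right)
  + \gamma \sum_{k=1}^{g} v_k \, \lVert \beta_k \rVert_2^{q},
  \qquad q \in \{1, 2\}
```

Under this reading, lambda charges a fixed cost for each group that enters the model, while gamma shrinks the coefficients of the groups that are active.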
Usage
grpsel(
x,
y,
group = seq_len(ncol(x)),
penalty = c("grSubset", "grSubset+grLasso", "grSubset+Ridge"),
loss = c("square", "logistic"),
local.search = FALSE,
orthogonalise = FALSE,
nlambda = 100,
lambda.step = 0.99,
lambda = NULL,
lambda.factor = NULL,
ngamma = 10,
gamma.max = 100,
gamma.min = 1e-04,
gamma = NULL,
gamma.factor = NULL,
pmax = ncol(x),
gmax = length(unique(group)),
eps = 1e-04,
max.cd.iter = 10000,
max.ls.iter = 100,
active.set = TRUE,
active.set.count = 3,
sort = TRUE,
screen = 500,
warn = TRUE
)
Arguments
x |
a predictor matrix |
y |
a response vector |
group |
a vector of length ncol(x) whose jth element identifies the group that the jth predictor belongs to |
penalty |
the type of penalty to apply; one of 'grSubset', 'grSubset+grLasso', or 'grSubset+Ridge' |
loss |
the type of loss function to use; 'square' for linear regression or 'logistic' for logistic regression |
local.search |
a logical indicating whether to perform local search after coordinate descent; typically leads to higher quality solutions |
orthogonalise |
a logical indicating whether to orthogonalise within groups |
nlambda |
the number of group subset selection parameters to evaluate when lambda is not supplied |
lambda.step |
the step size taken when computing lambda |
lambda |
an optional list of decreasing sequences of group subset selection parameters; the
list should contain a vector for each value of gamma |
lambda.factor |
a vector of penalty factors applied to the group subset selection penalty; equal to the group sizes by default |
ngamma |
the number of group lasso or ridge parameters to evaluate when gamma is not supplied |
gamma.max |
the maximum value for gamma |
gamma.min |
the minimum value for gamma |
gamma |
an optional decreasing sequence of group lasso or ridge parameters |
gamma.factor |
a vector of penalty factors applied to the shrinkage penalty; by default,
equal to the square root of the group sizes when penalty = 'grSubset+grLasso' |
pmax |
the maximum number of predictors ever allowed to be active; ignored if lambda is supplied |
gmax |
the maximum number of groups ever allowed to be active; ignored if lambda is supplied |
eps |
the convergence tolerance; convergence is declared when the relative maximum
difference in consecutive coefficients is less than eps |
max.cd.iter |
the maximum number of coordinate descent iterations allowed per value of
lambda and gamma |
max.ls.iter |
the maximum number of local search iterations allowed per value of
lambda and gamma |
active.set |
a logical indicating whether to use active set updates; typically lowers the run time |
active.set.count |
the number of consecutive coordinate descent iterations in which a subset should appear before running active set updates |
sort |
a logical indicating whether to sort the coordinates before running coordinate descent; required for gradient screening; typically leads to higher quality solutions |
screen |
the number of groups to keep after gradient screening; smaller values typically lower the run time |
warn |
a logical indicating whether to print a warning if the algorithms fail to converge |
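The default penalty factors described above depend only on the group sizes. The following base R sketch shows how such factors could be computed by hand; the grouping vector is a made-up example, and the variable names simply mirror the argument names rather than reproduce the package's internal code.

```r
# Hypothetical grouping: 10 predictors in 3 groups of unequal size
group <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)

# Number of predictors in each group, in order of group label
group.size <- as.vector(table(group))

# Group subset penalty factors: the group sizes
lambda.factor <- group.size

# Shrinkage penalty factors: the square root of the group sizes
gamma.factor <- sqrt(group.size)

print(lambda.factor)
print(gamma.factor)
```

Vectors built this way have one entry per group, which is the shape assumed here for the lambda.factor and gamma.factor arguments.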
Details
For linear regression (loss='square') the response and predictors are centred
about zero and scaled to unit l2-norm. For logistic regression (loss='logistic') only the
predictors are centred and scaled, and an intercept is fit during the course of the algorithm.
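The standardisation described above can be illustrated in base R. This is a minimal sketch of centring about zero and scaling to unit l2-norm, using simulated data and a hypothetical helper named standardise; it is not the package's internal routine.

```r
set.seed(1)
x <- matrix(rnorm(20 * 3, mean = 5, sd = 2), 20, 3)
y <- rnorm(20, mean = 10)

# Hypothetical helper: centre a vector about zero, then scale to unit l2-norm
standardise <- function(v) {
  v <- v - mean(v)
  v / sqrt(sum(v^2))
}

x.std <- apply(x, 2, standardise)  # standardise each predictor column
y.std <- standardise(y)            # standardise the response (square loss case)

print(colMeans(x.std))    # numerically zero
print(colSums(x.std^2))   # squared l2-norm of each column
```

For the logistic case described above, only the predictor step would apply, since the response is left unstandardised.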
Value
An object of class grpsel; a list with the following components:
beta |
a list of matrices whose columns contain fitted coefficients for a given value of
lambda; each matrix corresponds to a value of gamma |
gamma |
a vector containing the values of gamma |
lambda |
a list of vectors containing the values of lambda per value of gamma |
np |
a list of vectors containing the number of active predictors per value of
lambda and gamma |
ng |
a list of vectors containing the number of active groups per value of
lambda and gamma |
iter.cd |
a list of vectors containing the number of coordinate descent iterations per value
of lambda and gamma |
iter.ls |
a list of vectors containing the number of local search iterations per value
of lambda and gamma |
loss |
a list of vectors containing the evaluated loss function per value of lambda and gamma |
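The components listed above can be inspected directly on a fitted object. The sketch below assumes the grpsel package is installed (the code is skipped otherwise) and uses simulated data; it only looks at the structure of the components and makes no claim about their exact dimensions.

```r
if (requireNamespace("grpsel", quietly = TRUE)) {
  set.seed(123)
  x <- matrix(rnorm(100 * 10), 100, 10)
  y <- rnorm(100, rowSums(x[, 1:4]))
  group <- rep(1:5, each = 2)

  fit <- grpsel::grpsel(x, y, group, penalty = "grSubset+Ridge", ngamma = 3)

  str(fit$gamma)        # vector of ridge parameters
  str(fit$lambda)       # list of group subset parameter sequences
  str(fit$beta[[1]])    # first coefficient matrix in the list
  str(fit$np[[1]])      # active predictor counts for the first list element
  str(fit$ng[[1]])      # active group counts for the first list element
}
```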
Author(s)
Ryan Thompson <ryan.thompson@monash.edu>
References
Thompson, R. and Vahid, F. (2021). 'Group selection and shrinkage with application to sparse semiparametric modeling'. arXiv: 2105.12081.
Examples
# Grouped data
set.seed(123)
n <- 100
p <- 10
g <- 5
group <- rep(1:g, each = p / g)
beta <- numeric(p)
beta[which(group %in% 1:2)] <- 1
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n, x %*% beta)
newx <- matrix(rnorm(p), ncol = p)
# Group subset selection
fit <- grpsel(x, y, group)
plot(fit)
coef(fit, lambda = 0.05)
predict(fit, newx, lambda = 0.05)
# Group subset selection with group lasso shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+grLasso')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)
# Group subset selection with ridge shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+Ridge')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)
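The lambda and gamma arguments can also be supplied by the user instead of being computed. The sketch below is an illustration under assumptions: the grid values are arbitrary, the grpsel package is assumed to be installed (the code is skipped otherwise), and lambda is passed as a list with one decreasing vector per value of gamma, as the argument description requires.

```r
if (requireNamespace("grpsel", quietly = TRUE)) {
  set.seed(123)
  x <- matrix(rnorm(100 * 10), 100, 10)
  y <- rnorm(100, rowSums(x[, 1:4]))
  group <- rep(1:5, each = 2)

  # Arbitrary illustrative grid: two ridge parameters in decreasing order,
  # each paired with its own decreasing sequence of group subset parameters
  gamma <- c(1, 0.1)
  lambda <- list(c(0.1, 0.05, 0.01),
                 c(0.1, 0.05, 0.01))

  fit <- grpsel::grpsel(x, y, group, penalty = "grSubset+Ridge",
                        lambda = lambda, gamma = gamma)

  # Coefficients at one point of the user-supplied grid
  print(coef(fit, lambda = 0.05, gamma = 0.1))
}
```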