R: Group subset selection

grpsel {grpsel}

R Documentation

Group subset selection

Description

Fits the regularisation surface for a regression model with a group subset selection penalty. The group subset penalty can be combined with either a group lasso or ridge penalty for shrinkage. The group subset parameter is lambda and the group lasso/ridge parameter is gamma.

Usage

grpsel(
  x,
  y,
  group = seq_len(ncol(x)),
  penalty = c("grSubset", "grSubset+grLasso", "grSubset+Ridge"),
  loss = c("square", "logistic"),
  local.search = FALSE,
  orthogonalise = FALSE,
  nlambda = 100,
  lambda.step = 0.99,
  lambda = NULL,
  lambda.factor = NULL,
  ngamma = 10,
  gamma.max = 100,
  gamma.min = 1e-04,
  gamma = NULL,
  gamma.factor = NULL,
  pmax = ncol(x),
  gmax = length(unique(group)),
  eps = 1e-04,
  max.cd.iter = 10000,
  max.ls.iter = 100,
  active.set = TRUE,
  active.set.count = 3,
  sort = TRUE,
  screen = 500,
  warn = TRUE
)

Arguments

`x`	a predictor matrix
`y`	a response vector
`group`	a vector of length `ncol(x)` with the jth element identifying the group that the jth predictor belongs to; alternatively, a list of vectors with the kth vector identifying the predictors that belong to the kth group (useful for overlapping groups)
`penalty`	the type of penalty to apply; one of 'grSubset', 'grSubset+grLasso', or 'grSubset+Ridge'
`loss`	the type of loss function to use; 'square' for linear regression or 'logistic' for logistic regression
`local.search`	a logical indicating whether to perform local search after coordinate descent; typically leads to higher quality solutions
`orthogonalise`	a logical indicating whether to orthogonalise within groups
`nlambda`	the number of group subset selection parameters to evaluate when `lambda` is computed automatically; may evaluate fewer parameters if `pmax` or `gmax` is reached first
`lambda.step`	the step size taken when computing `lambda` from the data; should be a value strictly between 0 and 1; larger values typically lead to a finer grid of subset sizes
`lambda`	an optional list of decreasing sequences of group subset selection parameters; the list should contain a vector for each value of `gamma`
`lambda.factor`	a vector of penalty factors applied to the group subset selection penalty; equal to the group sizes by default
`ngamma`	the number of group lasso or ridge parameters to evaluate when `gamma` is computed automatically
`gamma.max`	the maximum value for `gamma` when `penalty='grSubset+Ridge'`; when `penalty='grSubset+grLasso'`, `gamma.max` is computed automatically from the data
`gamma.min`	the minimum value for `gamma` when `penalty='grSubset+Ridge'`; when `penalty='grSubset+grLasso'`, the minimum value for `gamma` expressed as a fraction of `gamma.max`
`gamma`	an optional decreasing sequence of group lasso or ridge parameters
`gamma.factor`	a vector of penalty factors applied to the shrinkage penalty; by default, equal to the square root of the group sizes when `penalty='grSubset+grLasso'` or a vector of ones when `penalty='grSubset+Ridge'`
`pmax`	the maximum number of predictors ever allowed to be active; ignored if `lambda` is supplied
`gmax`	the maximum number of groups ever allowed to be active; ignored if `lambda` is supplied
`eps`	the convergence tolerance; convergence is declared when the relative maximum difference in consecutive coefficients is less than `eps`
`max.cd.iter`	the maximum number of coordinate descent iterations allowed per value of `lambda` and `gamma`
`max.ls.iter`	the maximum number of local search iterations allowed per value of `lambda` and `gamma`
`active.set`	a logical indicating whether to use active set updates; typically lowers the run time
`active.set.count`	the number of consecutive coordinate descent iterations in which a subset should appear before running active set updates
`sort`	a logical indicating whether to sort the coordinates before running coordinate descent; required for gradient screening; typically leads to higher quality solutions
`screen`	the number of groups to keep after gradient screening; smaller values typically lower the run time
`warn`	a logical indicating whether to print a warning if the algorithms fail to converge
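
The two accepted forms of `group` and the default penalty factors described above can be sketched in base R. This is an illustration only, not the package's internal code: the names `group_list`, `group_sizes`, `lambda_factor`, `gamma_factor_grlasso`, `gamma_factor_ridge`, and `overlapping` are hypothetical, and treating the size of an overlapping group as the length of its index vector is an assumption.

```r
# Vector form: the jth element gives the group of the jth predictor.
group <- rep(1:5, each = 2)

# Equivalent list form: the kth element holds the predictor indices of group k.
group_list <- split(seq_along(group), group)

# Group sizes, from which the documented default penalty factors follow.
group_sizes <- lengths(group_list)
lambda_factor <- group_sizes                              # default for the group subset penalty
gamma_factor_grlasso <- sqrt(group_sizes)                 # default for 'grSubset+grLasso'
gamma_factor_ridge <- rep(1, length(group_list))          # default for 'grSubset+Ridge'

# Overlapping groups can only be written in list form:
# predictor 5 belongs to both groups here.
overlapping <- list(1:5, 5:10)
```

A list such as `overlapping` would be passed as `group = overlapping`, while non-overlapping groups can use either form.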

Details

For linear regression (loss='square'), the response and predictors are centred about zero and scaled to unit l2-norm. For logistic regression (loss='logistic'), only the predictors are centred and scaled, and an intercept is fit during the course of the algorithm.
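
The centring and scaling described above can be reproduced in base R to see what the standardised data look like. This is a minimal sketch, not the package's own routine; `standardise` is a hypothetical helper, and it assumes no column is constant (a constant column would have zero norm after centring).

```r
# Centre each column about zero, then scale it to unit l2-norm.
standardise <- function(x) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x), "-")
  norms <- sqrt(colSums(centred^2))
  sweep(centred, 2, norms, "/")
}

set.seed(123)
x <- matrix(rnorm(100 * 10), 100, 10)
xs <- standardise(x)

colMeans(xs)    # numerically zero for every column
colSums(xs^2)   # numerically one for every column
```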

Value

An object of class grpsel; a list with the following components:

`beta`	a list of matrices whose columns contain fitted coefficients for a given value of `lambda`; an individual matrix in the list for each value of `gamma`
`gamma`	a vector containing the values of `gamma` used in the fit
`lambda`	a list of vectors containing the values of `lambda` used in the fit; an individual vector in the list for each value of `gamma`
`np`	a list of vectors containing the number of active predictors per value of `lambda`; an individual vector in the list for each value of `gamma`
`ng`	a list of vectors containing the number of active groups per value of `lambda`; an individual vector in the list for each value of `gamma`
`iter.cd`	a list of vectors containing the number of coordinate descent iterations per value of `lambda`; an individual vector in the list for each value of `gamma`
`iter.ls`	a list of vectors containing the number of local search iterations per value of `lambda`; an individual vector in the list for each value of `gamma`
`loss`	a list of vectors containing the value of the loss function per value of `lambda`; an individual vector in the list for each value of `gamma`
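
Because every component except `gamma` is a list indexed by the value of `gamma`, a fit is usually inspected one `gamma` at a time. The sketch below assumes the grpsel package is installed and does nothing otherwise; the simulated data and the index `k` are illustrative.

```r
if (requireNamespace("grpsel", quietly = TRUE)) {
  set.seed(123)
  x <- matrix(rnorm(100 * 10), 100, 10)
  y <- rnorm(100, rowSums(x[, 1:4]))
  group <- rep(1:5, each = 2)

  fit <- grpsel::grpsel(x, y, group, penalty = "grSubset+Ridge", ngamma = 3)

  # One list element per value of gamma.
  length(fit$gamma)
  length(fit$beta)

  # For the kth gamma: one column of coefficients per value of lambda.
  k <- 1
  ncol(fit$beta[[k]])
  length(fit$lambda[[k]])

  # Active groups and predictors along the lambda path for the kth gamma.
  fit$ng[[k]]
  fit$np[[k]]
}
```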

Author(s)

Ryan Thompson <ryan.thompson@monash.edu>

References

Thompson, R. and Vahid, F. (2021). 'Group selection and shrinkage with application to sparse semiparametric modeling'. arXiv: 2105.12081.

Examples

# Grouped data
set.seed(123)
n <- 100
p <- 10
g <- 5
group <- rep(1:g, each = p / g)
beta <- numeric(p)
beta[which(group %in% 1:2)] <- 1
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n, x %*% beta)
newx <- matrix(rnorm(p), ncol = p)

# Group subset selection
fit <- grpsel(x, y, group)
plot(fit)
coef(fit, lambda = 0.05)
predict(fit, newx, lambda = 0.05)

# Group subset selection with group lasso shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+grLasso')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)

# Group subset selection with ridge shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+Ridge')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)
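
The examples above all use the default square loss. A logistic fit follows the same pattern, as in the sketch below, which assumes the grpsel package is installed and that a binary response coded as 0/1 is accepted; the simulated response `y_bin` is a hypothetical name introduced for illustration.

```r
if (requireNamespace("grpsel", quietly = TRUE)) {
  set.seed(123)
  n <- 100
  p <- 10
  g <- 5
  group <- rep(1:g, each = p / g)
  beta <- numeric(p)
  beta[group %in% 1:2] <- 1
  x <- matrix(rnorm(n * p), n, p)

  # Binary response drawn from a logistic model (0/1 coding is an assumption).
  y_bin <- rbinom(n, 1, plogis(as.vector(x %*% beta)))

  # Group subset selection with logistic loss
  fit_logistic <- grpsel::grpsel(x, y_bin, group, loss = "logistic")
  coef(fit_logistic, lambda = 0.05)
}
```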

[Package grpsel version 1.3.1 Index]