rq.group.pen {rqPen} R Documentation
Fits quantile regression models using a group penalized objective function.
Description
Let the predictors be divided into G groups, with corresponding coefficient vectors \beta_1,\ldots,\beta_G, and let \rho_\tau(a) = a[\tau-I(a<0)]. For quantiles \tau_1,\ldots,\tau_Q, quantile regression models are fit by minimizing the penalized objective function
\sum_{q=1}^Q \frac{1}{n} \sum_{i=1}^n m_i \rho_{\tau_q}(y_i-x_i^\top\beta^q) + \sum_{q=1}^Q \sum_{g=1}^G P(||\beta^q_g||_k, w_q v_g \lambda, a),
where w_q and v_g are designated by tau.penalty.factor and group.pen.factor respectively, and m_i can be set by weights. The value of k is chosen by norm.
The form of P() depends on the penalty. Briefly (see the references or the vignette for more details):
- Group LASSO (gLASSO)
P(||\beta||_k,\lambda,a)=\lambda||\beta||_k
- Group SCAD
P(||\beta||_k,\lambda,a)=SCAD(||\beta||_k,\lambda,a)
- Group MCP
P(||\beta||_k,\lambda,a)=MCP(||\beta||_k,\lambda,a)
- Group Adaptive LASSO
P(||\beta||_k,\lambda,a)=\frac{\lambda ||\beta||_k}{|\beta_0|^a}
Note that if k=1 and the group lasso penalty is used, then this is identical to the regular lasso; the function will stop and suggest that you use rq.pen() instead. For the group adaptive lasso, the values of \beta_0 come from a ridge solution with the same value of \lambda.
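As a rough illustration of the objective above, the sketch below evaluates the group lasso (k = 2) version of the penalized loss for a single quantile. The helper function and its default weights are hypothetical illustrations of the formula, not the internal implementation used by rq.group.pen.

# Hypothetical helper: evaluates the group lasso (k = 2) penalized objective
# for one quantile tau; illustration only, not rqPen's internal code.
group_lasso_objective <- function(beta, x, y, groups, tau, lambda,
                                  group.pen.factor = sqrt(table(groups)),
                                  weights = rep(1, length(y))) {
  r <- y - x %*% beta
  check <- mean(weights * r * (tau - (r < 0)))            # (1/n) sum_i m_i rho_tau(r_i)
  group_norms <- tapply(beta, groups, function(b) sqrt(sum(b^2)))
  check + lambda * sum(group.pen.factor * group_norms)    # add the group penalty
}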
If the Huber algorithm is used, then \rho_\tau(y_i-x_i^\top\beta) is replaced by a Huber-type approximation. Specifically, it is replaced by h^\tau_\gamma(y_i-x_i^\top\beta)/2, where
h^\tau_\gamma(a) = \frac{a^2}{2\gamma} I(|a| \leq \gamma) + (|a|-\gamma/2)I(|a|>\gamma)+(2\tau-1)a.
If \tau=.5, this reduces to the usual Huber loss function.
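For illustration, the Huber-type function above can be coded directly; this is only a sketch of the formula, not the solver used when alg = "huber".

# h^tau_gamma(a) from the formula above; illustration only.
huber_tau <- function(a, tau, gamma) {
  base <- ifelse(abs(a) <= gamma, a^2 / (2 * gamma), abs(a) - gamma / 2)
  base + (2 * tau - 1) * a
}
# At tau = 0.5 the (2*tau - 1)*a term drops out, leaving the usual Huber loss.
huber_tau(seq(-2, 2, by = 0.5), tau = 0.5, gamma = 0.25)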
Usage
rq.group.pen(
x,
y,
tau = 0.5,
groups = 1:ncol(x),
penalty = c("gLASSO", "gAdLASSO", "gSCAD", "gMCP"),
lambda = NULL,
nlambda = 100,
eps = ifelse(nrow(x) < ncol(x), 0.05, 0.01),
alg = c("huber", "br"),
a = NULL,
norm = 2,
group.pen.factor = NULL,
tau.penalty.factor = rep(1, length(tau)),
scalex = TRUE,
coef.cutoff = 1e-08,
max.iter = 500,
converge.eps = 1e-04,
gamma = IQR(y)/10,
lambda.discard = TRUE,
weights = NULL,
...
)
Arguments
- x
Matrix of predictors.
- y
Vector of responses.
- tau
Vector of quantiles.
- groups
Vector of group assignments for the predictors.
- penalty
Penalty used. Choices are group lasso ("gLASSO"), group adaptive lasso ("gAdLASSO"), group SCAD ("gSCAD") and group MCP ("gMCP").
- lambda
Vector of lambda tuning parameters. Generated automatically if not set.
- nlambda
The number of lambda tuning parameters.
- eps
The value multiplied by the largest lambda value to determine the smallest lambda value.
- alg
Algorithm used. Choices are the Huber approximation ("huber") or linear programming ("br").
- a
The additional tuning parameter for adaptive lasso, SCAD and MCP.
- norm
Whether an L1 or L2 norm is used for the grouped coefficients.
- group.pen.factor
Penalty factor for each group. Default is 1 for all groups if norm = 1 and the square root of the group size if norm = 2.
- tau.penalty.factor
Penalty factor for each quantile.
- scalex
Whether x should be centered and scaled so that the columns have mean zero and standard deviation one. If set to TRUE, the coefficients are returned on the original scale of the data.
- coef.cutoff
Coefficient cutoff: any value below this number is set to zero. Useful for the linear programming algorithm, which is prone to finding almost, but not quite, sparse solutions.
- max.iter
The maximum number of iterations for the algorithms.
- converge.eps
The convergence criterion for the algorithms.
- gamma
The tuning parameter for the Huber loss.
- lambda.discard
Whether small values of lambda should be discarded when the solutions change very little between them.
- weights
Weights used in the quantile loss objective function.
- ...
Additional parameters.
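For example, the lambda sequence can be shortened or kept identical across models with the arguments above; the call below is only a hedged illustration with arbitrary values.

set.seed(1)
x <- matrix(rnorm(100 * 8), ncol = 8)
y <- 1 + x[, 1] - x[, 8] + rnorm(100)
# Coarser lambda path; lambda.discard = FALSE keeps every lambda for every model.
f <- rq.group.pen(x, y, groups = rep(1:4, each = 2),
                  nlambda = 25, eps = 0.05, lambda.discard = FALSE)
length(f$lambda)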
Value
An rq.pen.seq object.
- models
A list of the models fit for each combination of tau and a.
- n
Sample size.
- p
Number of predictors.
- alg
Algorithm used.
- tau
Quantiles modeled.
- penalty
Penalty used.
- a
Tuning parameters a used.
- lambda
Lambda values used for all models. If a model has solutions for fewer lambda values, say the first k, then it used only the first k values of lambda. Setting lambda.discard to FALSE will guarantee all models use the same lambdas, but may increase computational time noticeably and for little gain.
- modelsInfo
Information about the quantile and a value for each model.
- call
Original call.
Each model in the models list has the following values.
- coefficients
Coefficients for each value of lambda.
- rho
The unpenalized objective function for each value of lambda.
- PenRho
The penalized objective function for each value of lambda.
- nzero
The number of nonzero coefficients for each value of lambda.
- tau
Quantile of the model.
- a
Value of a for the penalized loss function.
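A brief sketch of accessing the components described above from a fitted object; the simulated data and object names are illustrative.

set.seed(1)
x <- matrix(rnorm(100 * 6), ncol = 6)
y <- 1 + x[, 1] - x[, 4] + rnorm(100)
fit <- rq.group.pen(x, y, groups = rep(1:3, each = 2), tau = c(.25, .75))
fit$n; fit$p; fit$tau          # sample size, number of predictors, quantiles
fit$modelsInfo                 # quantile and a value for each model
fit$models[[1]]$coefficients   # coefficients across the lambda sequence
fit$models[[1]]$nzero          # nonzero coefficients per lambda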
Author(s)
Ben Sherwood (ben.sherwood@ku.edu), Shaobo Li (shaobo.li@ku.edu) and Adam Maidman
References
Peng B, Wang L (2015). “An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression.” J. Comput. Graph. Statist., 24(3), 676-694.
Examples
## Not run:
set.seed(1)
x <- matrix(rnorm(200*8,sd=1),ncol=8)
y <- 1 + x[,1] + 3*x[,3] - x[,8] + rt(200,3)
g <- c(1,1,1,2,2,2,3,3)
tvals <- c(.25,.75)
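# Group lasso fit at the default quantile (tau = 0.5), then at the .25 and .75 quantiles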
r1 <- rq.group.pen(x,y,groups=g)
r5 <- rq.group.pen(x,y,groups=g,tau=tvals)
#Linear programming approach with group SCAD penalty and L1-norm
m2 <- rq.group.pen(x,y,groups=g,alg="br",penalty="gSCAD",norm=1,a=seq(3,4))
# No penalty for the first group
m3 <- rq.group.pen(x,y,groups=g,group.pen.factor=c(0,rep(1,2)))
# Smaller penalty for the median
m4 <- rq.group.pen(x,y,groups=g,tau=c(.25,.5,.75),tau.penalty.factor=c(1,.25,1))
## End(Not run)
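As a further hedged illustration, observation weights (the m_i in the objective function) can be supplied through the weights argument; the weight values below are arbitrary.

## Not run:
# Observation weights (m_i in the objective); arbitrary values for illustration
w <- runif(200, 0.5, 1.5)
m5 <- rq.group.pen(x, y, groups = g, weights = w)
## End(Not run)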