rq.gq.pen {rqPen}    R Documentation

Quantile regression estimation and consistent variable selection across multiple quantiles

Description

Uses the group lasso penalty across the quantiles to provide consistent selection across all Q modeled quantiles. Let \beta^q be the coefficients for the qth quantile, \beta_j be the Q-dimensional vector of the jth coefficient for each quantile, and \rho_\tau(u) be the quantile loss function. The method minimizes

\sum_{q=1}^Q \frac{1}{n} \sum_{i=1}^n m_i \rho_{\tau_q}(y_i-x_i^\top\beta^q) + \lambda \sum_{j=1}^p ||\beta_j||_{2,w},

where

||\beta_j||_{2,w} = \sqrt{\sum_{q=1}^Q w_q v_j \beta_{qj}^2},

w_q is a quantile weight that can be specified by tau.penalty.factor, v_j is a predictor weight that can be assigned by penalty.factor, and m_i is an observation weight that can be set by weights. The model is fit using a Huber approximation to the quantile loss, as presented in Sherwood and Li (2022).
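The pieces of this objective can be sketched in base R. This is only an illustration, not the package's implementation: the helper names (check_loss, huber, huber_check_loss, group_norm) are hypothetical, and the specific Huber form is an assumption based on the usual Huber-approximated quantile loss, with gmma playing the role of the smoothing parameter.

```r
# Quantile (check) loss: rho_tau(u) = u * (tau - I(u < 0)).
check_loss <- function(u, tau) u * (tau - (u < 0))

# Huber function with smoothing parameter gmma (assumed form):
# quadratic near zero, linear in the tails.
huber <- function(u, gmma) {
  ifelse(abs(u) <= gmma, u^2 / (2 * gmma), abs(u) - gmma / 2)
}

# Smoothed check loss, using rho_tau(u) = (|u| + (2 * tau - 1) * u) / 2
# with |u| replaced by the Huber function.
huber_check_loss <- function(u, tau, gmma) {
  (huber(u, gmma) + (2 * tau - 1) * u) / 2
}

# Weighted group norm of one predictor's coefficients across quantiles:
# sqrt(sum over quantiles of w * v_j * beta_j^2).
group_norm <- function(beta_j, w, v_j) sqrt(sum(w * v_j * beta_j^2))

u <- c(-2, -0.1, 0, 0.1, 2)
exact <- check_loss(u, tau = 0.25)
smoothed <- huber_check_loss(u, tau = 0.25, gmma = 0.2)

# Under this form the smoothed loss differs from the exact loss
# by at most gmma / 4.
max_gap <- max(abs(exact - smoothed))

# A predictor whose coefficient is zero at every quantile has zero norm;
# otherwise the whole vector of quantile-specific coefficients is penalized.
norm_zero <- group_norm(c(0, 0, 0), w = rep(1, 3), v_j = 1)
norm_example <- group_norm(c(3, 0, 4), w = rep(1, 3), v_j = 1)
```

Because the penalty acts on the whole vector of quantile-specific coefficients for a predictor, that predictor is either removed from every quantile or retained in all of them, which is the source of the consistent selection described above.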

Usage

rq.gq.pen(
  x,
  y,
  tau,
  lambda = NULL,
  nlambda = 100,
  eps = ifelse(nrow(x) < ncol(x), 0.01, 0.001),
  weights = NULL,
  penalty.factor = NULL,
  scalex = TRUE,
  tau.penalty.factor = NULL,
  gmma = 0.2,
  max.iter = 200,
  lambda.discard = TRUE,
  converge.eps = 1e-04,
  beta0 = NULL
)

Arguments

x

covariate matrix

y

a univariate response variable

tau

a sequence of quantiles to be modeled; must have length at least 3.

lambda

shrinkage parameter. Default is NULL, and the algorithm provides a solution path.

nlambda

Number of lambda values to be considered.

eps

If lambda is not specified, the lambda sequence runs from lambda_max down to lambda_max times eps.

weights

observation weights. Default is NULL, which means equal weights.

penalty.factor

weights for the shrinkage parameter for each covariate. Default is equal weight.

scalex

Whether x should be scaled before fitting the model. Coefficients are returned on the original scale.

tau.penalty.factor

weights for different quantiles. Default is equal weight.

gmma

tuning parameter for the Huber loss

max.iter

maximum number of iterations. Default is 200.

lambda.discard

Default is TRUE, meaning that the solution path stops once the relative change in deviance becomes sufficiently small, which usually happens near the end of the solution path. However, the program returns at least 70 models along the solution path.

converge.eps

The convergence tolerance (epsilon). Default is 1e-4.

beta0

Initial estimates. Default is NULL, and the algorithm starts with the intercepts set to the quantiles of the response variable and all other coefficients set to zero.
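The defaults described for eps, nlambda and beta0 can be illustrated with a short base R sketch. Several things here are assumptions rather than documented behavior: the value of lambda_max is hypothetical (the function derives it from the data), the grid is assumed to be log-spaced, and the matrix layout chosen for the starting values is only one plausible arrangement.

```r
set.seed(1)
n <- 50
p <- 100
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
tau <- c(0.25, 0.5, 0.75)

# Default eps from the Usage section: a larger ratio is used when n < p.
eps <- ifelse(nrow(x) < ncol(x), 0.01, 0.001)

# Hypothetical lambda_max, chosen only for illustration.
lambda_max <- 1.5
nlambda <- 100

# Assumed log-spaced grid from lambda_max down to lambda_max * eps.
lambda <- exp(seq(log(lambda_max), log(lambda_max * eps),
                  length.out = nlambda))

# Starting values in the spirit of the beta0 default: intercepts are the
# sample quantiles of y, and every other coefficient is zero.
# Illustrative layout: one column per quantile, intercept in the first row.
beta0 <- rbind(quantile(y, tau, names = FALSE),
               matrix(0, p, length(tau)))
```

With n < p, as in this sketch, the grid spans two orders of magnitude; with n >= p the default eps of 0.001 would extend it to three.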

Value

An rq.pen.seq object.

models:

A list of model fits, one for each combination of tau and a.

n:

Sample size.

p:

Number of predictors.

alg:

Algorithm used. Options are "huber" or any method implemented in rq(), such as "br".

tau:

Quantiles modeled.

a:

Tuning parameters a used.

modelsInfo:

Information about the quantile and a value for each model.

lambda:

Lambda values used for all models. If a model has coefficients for fewer values than the length of lambda, say k, then it used the first k values of lambda. Setting lambda.discard to FALSE will guarantee that all models use the same lambdas, but may increase computational time noticeably for little gain.

penalty:

Penalty used.

call:

Original call.

Each model in the models list has the following values.

coefficients:

Coefficients for each value of lambda.

rho:

The unpenalized objective function for each value of lambda.

PenRho:

The penalized objective function for each value of lambda.

nzero:

The number of nonzero coefficients for each value of lambda.

tau:

Quantile of the model.

a:

Value of a for the penalized loss function.
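The components listed above can be inspected directly on a fitted object. The following is a minimal sketch that assumes the rqPen package is installed and that the returned object can be accessed like a list with the component names documented in this section; the simulated data are purely illustrative.

```r
if (requireNamespace("rqPen", quietly = TRUE)) {
  set.seed(1)
  n <- 100
  p <- 5
  X <- matrix(rnorm(n * p), n, p)
  y <- 1 + X[, 1] - X[, 3] + rnorm(n)

  fit <- rqPen::rq.gq.pen(X, y, tau = c(0.25, 0.5, 0.75))

  # Top-level components described in the Value section.
  print(fit$tau)             # quantiles modeled
  print(length(fit$models))  # number of fitted models
  print(head(fit$lambda))    # start of the lambda sequence

  # Per-model components: coefficients and sparsity along the path.
  first <- fit$models[[1]]
  print(dim(first$coefficients))
  print(first$nzero)
}
```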

Author(s)

Shaobo Li and Ben Sherwood, ben.sherwood@ku.edu

References

Wang M, Kang X, Liang J, Wang K, Wu Y (2024). “Heteroscedasticity identification and variable selection via multiple quantile regression.” Journal of Statistical Computation and Simulation, 94(2), 297-314.

Sherwood B, Li S (2022). “Quantile regression feature selection and estimation with grouped variables using Huber approximation.” Statistics and Computing, 32(5), 75.

Examples

## Not run:  
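library(rqPen)
set.seed(1)  # for reproducibility
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p)
# Six active predictors with heavy-tailed t(2) errors
y <- -2 + X[, 1] + 0.5 * X[, 2] - X[, 3] - 0.5 * X[, 7] + X[, 8] - 0.2 * X[, 9] + rt(n, 2)
taus <- seq(0.1, 0.9, 0.2)
# Fit the group-penalized model across all five quantiles
fit <- rq.gq.pen(X, y, taus)
# Use an information criterion to select the best model;
# see rq.gq.pen.cv() for a cross-validation approach
qfit <- qic.select(fit)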

## End(Not run)

[Package rqPen version 4.1.1 Index]