rq.gq.pen {rqPen} | R Documentation |
Title Quantile regression estimation and consistent variable selection across multiple quantiles
Description
Uses the group lasso penalty across the quantiles to provide consistent selection across all K modeled quantiles. Let \beta^k
be the coefficients for the kth quantile, \beta_j
be the K-dimensional vector of the jth coefficient for each quantile, and
\rho_\tau(u)
be the quantile loss function. The method minimizes
\sum_{k=1}^K \frac{1}{n} \sum_{i=1}^n m_i \rho_{\tau_k}(y_i-x_i^\top\beta^k) + \lambda \sum_{j=1}^p ||\beta_j||_{2,w},
where
||\beta_j||_{2,w} = \sqrt{\sum_{k=1}^K w_kv_j\beta_{kj}^2},
w_k is a quantile weight that can be specified by tau.penalty.factor, v_j is a predictor weight that can be assigned by penalty.factor, and m_i is an observation weight that can be set by weights.
The model is fit using a Huber approximation to the quantile loss, as presented in Sherwood and Li (2022).
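The objective above can be written out in a few lines of base R. This is only an illustrative sketch, not the package's fitting code: the helper names (check_loss, huber, huber_check_loss, group_norm, gq_objective) are hypothetical, the intercepts are assumed to be kept separate from the penalized coefficients, and the smoothed loss assumes the Huber function replaces |u| in the identity \rho_\tau(u) = 0.5(|u| + (2\tau - 1)u), with gmma as the threshold.

```r
# Check (quantile) loss: rho_tau(u) = u * (tau - I(u < 0)).
check_loss <- function(u, tau) u * (tau - (u < 0))

# Huber function with threshold gmma: quadratic near zero, linear in the tails.
huber <- function(u, gmma) {
  ifelse(abs(u) <= gmma, u^2 / (2 * gmma), abs(u) - gmma / 2)
}

# Assumed smoothed loss: |u| in 0.5 * (|u| + (2 * tau - 1) * u) is replaced by huber(u).
huber_check_loss <- function(u, tau, gmma) {
  0.5 * (huber(u, gmma) + (2 * tau - 1) * u)
}

# Weighted group norm for predictor j: sqrt(sum_k w_k * v_j * beta_kj^2).
group_norm <- function(beta_j, w, v_j) sqrt(sum(w * v_j * beta_j^2))

# Penalized objective with the exact check loss.
# beta is a p x K matrix (row j = predictor j, column k = quantile k).
# b0 holds the K intercepts, left unpenalized here (an assumption).
gq_objective <- function(x, y, b0, beta, taus, lambda,
                         m = rep(1, length(y)),
                         w = rep(1, length(taus)),
                         v = rep(1, ncol(x))) {
  loss <- 0
  for (k in seq_along(taus)) {
    r <- y - b0[k] - drop(x %*% beta[, k])
    loss <- loss + mean(m * check_loss(r, taus[k]))
  }
  pen <- sum(vapply(seq_len(nrow(beta)),
                    function(j) group_norm(beta[j, ], w, v[j]),
                    numeric(1)))
  loss + lambda * pen
}
```

Because each predictor's K coefficients enter the penalty through a single norm, the penalty tends to set a predictor's coefficients to zero at every quantile at once, which is the source of the consistent selection across quantiles.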
Usage
rq.gq.pen(
x,
y,
tau,
lambda = NULL,
nlambda = 100,
eps = ifelse(nrow(x) < ncol(x), 0.01, 0.001),
weights = NULL,
penalty.factor = NULL,
scalex = TRUE,
tau.penalty.factor = NULL,
gmma = 0.2,
max.iter = 200,
lambda.discard = TRUE,
converge.eps = 1e-04,
beta0 = NULL
)
Arguments
x |
covariate matrix |
y |
a univariate response variable |
tau |
a sequence of quantiles to be modeled, must be of at least length 3. |
lambda |
shrinkage parameter. Default is NULL, and the algorithm provides a solution path. |
nlambda |
Number of lambda values to be considered. |
eps |
If lambda is not specified, the lambda sequence runs from lambda_max down to lambda_max times eps. |
weights |
observation weights. Default is NULL, which means equal weights. |
penalty.factor |
weights for the shrinkage parameter for each covariate. Default is equal weight. |
scalex |
Whether x should be scaled before fitting the model. Coefficients are returned on the original scale. |
tau.penalty.factor |
weights for different quantiles. Default is equal weight. |
gmma |
tuning parameter for the Huber loss |
max.iter |
maximum number of iterations. Default is 200. |
lambda.discard |
Default is TRUE, meaning that the solution path stops once the relative change in deviance becomes sufficiently small. This usually happens near the end of the solution path. However, the program returns at least 70 models along the solution path. |
converge.eps |
The epsilon level used to declare convergence. Default is 1e-4. |
beta0 |
Initial estimates. Default is NULL, and the algorithm starts with the intercepts being the quantiles of response variable and other coefficients being zeros. |
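As a rough illustration of how lambda, nlambda and eps interact, the sketch below builds a grid from lambda_max down to lambda_max times eps. The helper names lambda_grid and default_eps are hypothetical, the log spacing is an assumption (the text above only gives the two endpoints), and lambda_max = 1 is an arbitrary placeholder rather than the value the package would compute from the data.

```r
# Hypothetical helper: log-spaced grid from lambda_max down to lambda_max * eps.
# The log spacing is an assumption; only the endpoints come from the documentation.
lambda_grid <- function(lambda_max, eps, nlambda = 100) {
  exp(seq(log(lambda_max), log(lambda_max * eps), length.out = nlambda))
}

# Default eps from the usage above: 0.01 when n < p, otherwise 0.001.
default_eps <- function(x) ifelse(nrow(x) < ncol(x), 0.01, 0.001)

x <- matrix(0, nrow = 200, ncol = 10)  # placeholder design; only its shape matters
lams <- lambda_grid(lambda_max = 1, eps = default_eps(x), nlambda = 100)
range(lams)
```

A smaller eps extends the path toward less penalized, denser models, which is presumably why the default is larger when there are more predictors than observations.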
Value
An rq.pen.seq object.
- models:
A list of each model fit for each tau and a combination.
- n:
Sample size.
- p:
Number of predictors.
- alg:
Algorithm used. Options are "huber" or any method implemented in rq(), such as "br".
- tau:
Quantiles modeled.
- a:
Tuning parameters a used.
- modelsInfo:
Information about the quantile and a value for each model.
- lambda:
Lambda values used for all models. If a model was fit for fewer lambda values than are listed here, say k, then it used the first k values of lambda. Setting lambda.discard to FALSE will guarantee that all models use the same lambda values, but may noticeably increase computational time for little gain.
- penalty:
Penalty used.
- call:
Original call.
Each model in the models list has the following values.
- coefficients:
Coefficients for each value of lambda.
- rho:
The unpenalized objective function for each value of lambda.
- PenRho:
The penalized objective function for each value of lambda.
- nzero:
The number of nonzero coefficients for each value of lambda.
- tau:
Quantile of the model.
- a:
Value of a for the penalized loss function.
Author(s)
Shaobo Li and Ben Sherwood, ben.sherwood@ku.edu
References
Wang M, Kang X, Liang J, Wang K, Wu Y (2024). “Heteroscedasticity identification and variable selection via multiple quantile regression.” Journal of Statistical Computation and Simulation, 94(2), 297-314.
Sherwood B, Li S (2022). “Quantile regression feature selection and estimation with grouped variables using Huber approximation.” Statistics and Computing, 32(5), 75.
Examples
## Not run:
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- -2 + X[, 1] + 0.5 * X[, 2] - X[, 3] - 0.5 * X[, 7] + X[, 8] - 0.2 * X[, 9] + rt(n, 2)
taus <- seq(0.1, 0.9, 0.2)
fit <- rq.gq.pen(X, y, taus)
# use IC to select best model, see rq.gq.pen.cv() for a cross-validation approach
qfit <- qic.select(fit)
## End(Not run)