hqreg {hqreg}    R Documentation
Fit a robust regression model with Huber or quantile loss penalized by lasso or elastic-net
Description
Fit solution paths for Huber loss regression or quantile regression penalized by lasso or elastic-net over a grid of values for the regularization parameter lambda.
Usage
hqreg(X, y, method = c("huber", "quantile", "ls"),
gamma = IQR(y)/10, tau = 0.5, alpha = 1, nlambda = 100, lambda.min = 0.05, lambda,
preprocess = c("standardize", "rescale"), screen = c("ASR", "SR", "none"),
max.iter = 10000, eps = 1e-7, dfmax = ncol(X)+1, penalty.factor = rep(1, ncol(X)),
message = FALSE)
Arguments
X
Input matrix.
y
Response vector.
method
The loss function to be used in the model. Either "huber" (default),
"quantile", or "ls" for least squares (see Details).
gamma
The tuning parameter of Huber loss, with no effect for the other loss functions. Huber loss is quadratic for absolute values less than gamma and linear for those greater than gamma. The default value is IQR(y)/10.
tau
The tuning parameter of the quantile loss, with no effect for the other loss functions. It represents the conditional quantile of the response to be estimated, so it must be a number between 0 and 1. At tau = 0.5 (default), the quantile loss reduces to half the absolute loss.
alpha
The elastic-net mixing parameter that controls the relative contribution
from the lasso and the ridge penalty. It must be a number between 0 and 1;
alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty.
nlambda
The number of lambda values. Default is 100.
lambda.min
The smallest value for lambda, as a fraction of lambda.max, the data-derived entry value. Default is 0.05.
lambda
A user-specified sequence of lambda values. Typical usage is to leave this
blank and have the program automatically compute a lambda sequence based on
nlambda and lambda.min. Supplying lambda overrides this default.
preprocess
Preprocessing technique to be applied to the input. Either
"standardize" (default) or "rescale" (see Details).
screen
Screening rule to be applied at each lambda value to discard variables and speed up computation. Either "ASR" (default, the adaptive strong rule), "SR" (the strong rule), or "none".
max.iter
Maximum number of iterations. Default is 10000.
eps
Convergence threshold. The algorithms continue until the maximum change in the
objective after any coefficient update is less than eps. Default is 1e-7.
dfmax
Upper bound for the number of nonzero coefficients. The algorithm exits and
returns a partial path if this bound is exceeded. Default is ncol(X)+1.
penalty.factor
A numeric vector of length equal to the number of variables. Each
component multiplies lambda, allowing differential penalization across
variables; a factor of 0 leaves the corresponding variable unpenalized.
Default is 1 for every variable.
message
If set to TRUE, hqreg will inform the user of its progress. This argument is kept for debugging. Default is FALSE.
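For instance, a call that exercises several of these arguments at once might look like the following minimal sketch (the simulated data and the chosen values are purely illustrative):

# Illustrative call combining several non-default arguments:
# a 50/50 elastic-net mix, a denser lambda grid, and the plain strong rule.
X <- matrix(rnorm(500 * 50), 500, 50)
y <- rnorm(500)
fit <- hqreg(X, y, method = "huber", alpha = 0.5,
             nlambda = 200, lambda.min = 0.01, screen = "SR")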
Details
The sequence of models indexed by the regularization parameter lambda is fit
using a semismooth Newton coordinate descent algorithm. The objective function
is defined to be

\frac{1}{n} \sum_{i=1}^{n} loss(t_i) + \lambda \cdot \textrm{penalty}.

For method = "huber",

loss(t) = \frac{t^2}{2\gamma} I(|t| \le \gamma) + \left(|t| - \frac{\gamma}{2}\right) I(|t| > \gamma);

for method = "quantile",

loss(t) = t\,(\tau - I(t < 0));

for method = "ls",

loss(t) = \frac{t^2}{2}.

In the model, t stands for the residual of each observation.
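As a concrete illustration, the three losses can be written directly in R. The helper names below are ours, not part of hqreg, and the penalty is written in the standard elastic-net form alpha*||b||_1 + (1 - alpha)/2*||b||_2^2, which is an assumption about the exact penalty parameterization:

# Hypothetical helpers mirroring the loss formulas above (not hqreg internals)
huber_loss    <- function(t, gamma) {
  ifelse(abs(t) <= gamma, t^2 / (2 * gamma), abs(t) - gamma / 2)
}
quantile_loss <- function(t, tau) t * (tau - (t < 0))
ls_loss       <- function(t) t^2 / 2

# Penalized objective for slopes b, given residuals t = y - X %*% b - b0
objective <- function(t, b, lambda, alpha, gamma) {
  mean(huber_loss(t, gamma)) +
    lambda * (alpha * sum(abs(b)) + (1 - alpha) / 2 * sum(b^2))
}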
The program supports different types of preprocessing techniques. They are
applied to each column of the input matrix X. Let x be a column of X. For
preprocess = "standardize", the formula is

x' = \frac{x - mean(x)}{sd(x)};

for preprocess = "rescale",

x' = \frac{x - min(x)}{max(x) - min(x)}.

The models are fit on the preprocessed input, and the coefficients are then
transformed back to the original scale via some algebra. To fit a model on raw
data with no preprocessing, use hqreg_raw.
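To make the back-transformation concrete, here is a sketch of the algebra for the "standardize" case. It is an illustration under our own variable names, not the package's internal code, and it uses the sample standard deviation (the package's internal scaling may differ slightly):

# Back-transformation sketch for preprocess = "standardize"
set.seed(1)
X   <- matrix(rnorm(100 * 3), 100, 3)
mu  <- colMeans(X)                       # column means
s   <- apply(X, 2, sd)                   # column standard deviations
Xs  <- scale(X, center = mu, scale = s)  # standardized input

bp  <- c(0.5, -1, 2)   # slopes on the standardized scale (illustrative)
b0p <- 0.3             # intercept on the standardized scale (illustrative)

b  <- bp / s               # slopes on the original scale
b0 <- b0p - sum(mu * b)    # intercept on the original scale

# The fitted values agree on both scales:
all.equal(drop(Xs %*% bp + b0p), drop(X %*% b + b0))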
Value
The function returns an object of S3 class "hqreg", which is a list containing:
call
The call that produced this object.
beta
The fitted matrix of coefficients. The number of rows is equal to the number
of coefficients, and the number of columns is equal to the length of the
lambda sequence.
iter
A vector of the same length as lambda, containing the number of iterations
until convergence at each value of lambda.
saturated
A logical flag for whether the number of nonzero coefficients has reached dfmax.
lambda
The sequence of regularization parameter values in the path.
alpha
Same as above.
gamma
Same as above.
tau
Same as above.
penalty.factor
Same as above.
method
Same as above.
nv
The number of screening-rule violations. The screening rules are accompanied
by checks of the optimality conditions; when violations occur, the program
adds the violating variables back in and re-runs the inner loop until
convergence.
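These components can be inspected directly from the returned list. A brief sketch (the fitted object here is illustrative):

# Inspecting the pieces of a fitted "hqreg" object
fit <- hqreg(matrix(rnorm(200 * 20), 200, 20), rnorm(200))
dim(fit$beta)        # number of coefficients x number of lambda values
length(fit$lambda)   # the lambda path (default length: nlambda = 100)
head(fit$iter)       # iterations to convergence at the first few lambdas
fit$saturated        # TRUE if dfmax was reached and the path is partial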
Author(s)
Congrui Yi <congrui-yi@uiowa.edu>
References
Yi, C. and Huang, J. (2016). Semismooth Newton Coordinate Descent Algorithm
for Elastic-Net Penalized Huber Loss Regression and Quantile Regression.
Journal of Computational and Graphical Statistics (accepted November 2016).
arXiv: https://arxiv.org/abs/1509.02957
Publisher: http://www.tandfonline.com/doi/full/10.1080/10618600.2016.1256816
See Also
plot.hqreg, cv.hqreg, hqreg_raw
Examples
set.seed(123)  # make the simulated data reproducible
X = matrix(rnorm(1000*100), 1000, 100)
beta = rnorm(10)
eps = 4*rnorm(1000)
y = drop(X[,1:10] %*% beta + eps)

# Huber loss (default)
fit1 = hqreg(X, y)
coef(fit1, 0.01)
predict(fit1, X[1:5,], lambda = c(0.02, 0.01))

# Quantile loss, estimating the 0.2 conditional quantile
fit2 = hqreg(X, y, method = "quantile", tau = 0.2)
plot(fit2)

# Squared loss with min-max rescaling of the input columns
fit3 = hqreg(X, y, method = "ls", preprocess = "rescale")
plot(fit3, xvar = "norm")
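As a further sketch using only the arguments documented above, one can trace fits across several quantile levels (the tau values here are arbitrary):

# Sketch: compare sparsity across quantile levels (illustrative tau values)
for (tau in c(0.1, 0.5, 0.9)) {
  fit <- hqreg(X, y, method = "quantile", tau = tau)
  cat("tau =", tau, "nonzero coefficients at smallest lambda:",
      sum(coef(fit, min(fit$lambda))[-1] != 0), "\n")
}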