R: Does k-folds cross validation for rq.pen. If multiple values...

rq.pen.cv {rqPen}

R Documentation

Performs k-fold cross-validation for rq.pen. If multiple values of a are specified, it performs a grid search for the best combination of `\lambda` and a.

Description

Performs k-fold cross-validation for rq.pen. If multiple values of a are specified, it performs a grid search for the best combination of \lambda and a.

Usage

rq.pen.cv(
  x,
  y,
  tau = 0.5,
  lambda = NULL,
  penalty = c("LASSO", "Ridge", "ENet", "aLASSO", "SCAD", "MCP"),
  a = NULL,
  cvFunc = NULL,
  nfolds = 10,
  foldid = NULL,
  nlambda = 100,
  groupError = TRUE,
  cvSummary = mean,
  tauWeights = rep(1, length(tau)),
  printProgress = FALSE,
  weights = NULL,
  ...
)

Arguments

`x`	Matrix of predictors.
`y`	Vector of responses.
`tau`	Quantiles to be modeled.
`lambda`	Values of `\lambda`. Default will automatically select the `\lambda` values.
`penalty`	Choice of penalty between LASSO, Ridge, Elastic Net (ENet), Adaptive Lasso (aLASSO), SCAD and MCP.
`a`	Second tuning parameter, a. LASSO and Ridge have no second tuning parameter, but for notational consistency a is set to 1 and 0 respectively, the corresponding elastic net values. Defaults are Ridge ()
`cvFunc`	Loss function for cross-validation. Defaults to quantile loss, but user can specify their own function.
`nfolds`	Number of folds.
`foldid`	Fold ids. If set, overrides nfolds.
`nlambda`	Number of lambda values; ignored if lambda is set.
`groupError`	If set to FALSE, the reported error is the sum of all errors rather than the sum of the errors for each fold.
`cvSummary`	Function used to summarize the errors across the folds; the default is mean. The user can specify another function, such as median.
`tauWeights`	Weights for the different tau models. Only used in group tau results (gtr).
`printProgress`	If set to TRUE prints which partition is being worked on.
`weights`	Weights for the quantile loss objective function.
`...`	Additional arguments passed to rq.pen()
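
As a minimal sketch of how a `foldid` vector might be built, assuming it is an integer vector with one fold label per observation (the description above only says it holds fold ids), balanced random folds can be generated in base R. The helper name `make_foldid` is hypothetical and not part of rqPen.

```r
# Hypothetical helper: assign each of n observations to one of nfolds folds.
make_foldid <- function(n, nfolds = 10) {
  # Repeat the labels 1..nfolds up to length n, then shuffle them.
  sample(rep(seq_len(nfolds), length.out = n))
}

set.seed(1)
foldid <- make_foldid(100, nfolds = 10)
table(foldid)  # fold sizes; balanced here because 100 is a multiple of 10
```

Under the stated assumption, the vector could then be supplied as rq.pen.cv(x, y, foldid = foldid), which lets different fits share the same partition.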

Details

Two cross-validation results are returned. The first considers the best combination of a and lambda for each quantile. The second considers the best combination of the tuning parameters across all quantiles. Let y_{b,i}, x_{b,i}, and m_{b,i} denote the response, predictors, and weight of observation i in fold b. Let \hat{\beta}_{\tau,a,\lambda}^{-b} be the estimator for a given quantile and tuning parameters that did not use the bth fold. Let n_b be the number of observations in fold b. Then the cross-validation error for fold b is

\mbox{CV}(b,\tau) = \frac{1}{n_b} \sum_{i=1}^{n_b} m_{b,i} \rho_\tau(y_{b,i}-x_{b,i}^\top\hat{\beta}_{\tau,a,\lambda}^{-b}).
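
The fold-level error above can be sketched in base R. This is an illustration only, not the package's internal code: the names `check_loss` and `fold_cv_error` are hypothetical, the `loss(r, tau)` signature is an assumption, and the intercept is assumed to be the first element of the coefficient vector.

```r
# Check (quantile) loss: rho_tau(r) = r * (tau - I(r < 0)).
check_loss <- function(r, tau) r * (tau - (r < 0))

# CV error for one held-out fold b.
# y_b, x_b, m_b: response, predictor matrix and weights for fold b.
# beta: coefficients (intercept first, an assumption) fit without fold b.
fold_cv_error <- function(y_b, x_b, m_b, beta, tau, loss = check_loss) {
  resid <- y_b - drop(cbind(1, x_b) %*% beta)
  sum(m_b * loss(resid, tau)) / length(y_b)
}

# Toy example with made-up numbers: fitted values are 1, 2, 3.
y_b  <- c(1, 2, 4)
x_b  <- matrix(c(0, 1, 2), ncol = 1)
beta <- c(1, 1)
err  <- fold_cv_error(y_b, x_b, m_b = rep(1, 3), beta = beta, tau = 0.5)
```

Only the third observation has a nonzero residual (1), so the fold error is 0.5 / 3.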

Note that \rho_\tau() can be replaced by a different function by setting the cvFunc parameter. The function returns two different cross-validation summaries. The first is btr, the by-tau results. It provides the values of lambda and a that minimize the average, or whatever function is used for cvSummary, of \mbox{CV}(b,\tau) across the folds. In addition, it provides the sparsest solution that is within one standard error of the minimum.
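
A one-standard-error selection of this kind might look as follows for a single tau and a fixed a. This is a hedged sketch rather than the package's implementation: `select_lambda` is a hypothetical helper, the numbers are made up, and it assumes the largest lambda within the threshold gives the sparsest model, which is typical for lasso-type penalties but not guaranteed.

```r
# Hypothetical helper: choose lambda by minimum CV error and by a
# one-standard-error rule, for one tau and one value of a.
select_lambda <- function(lambda, cverr, cvse) {
  i_min <- which.min(cverr)
  threshold <- cverr[i_min] + cvse[i_min]
  # Candidates whose CV error is within one standard error of the minimum.
  ok <- which(cverr <= threshold)
  # Assumption: the largest lambda among them is the sparsest solution.
  i_1se <- ok[which.max(lambda[ok])]
  c(lambda_min = lambda[i_min], lambda_1se = lambda[i_1se])
}

lambda <- c(1, 0.5, 0.25, 0.1)
cverr  <- c(0.60, 0.45, 0.42, 0.44)
cvse   <- c(0.05, 0.04, 0.04, 0.04)
sel <- select_lambda(lambda, cverr, cvse)
```

Here the minimum error occurs at lambda = 0.25, while lambda = 0.5 is the largest value whose error stays below 0.42 + 0.04.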

The other approach is the group tau results, gtr. Consider the case of estimating Q quantiles, \tau_1,\ldots,\tau_Q, with quantile weights (tauWeights) of v_q. The gtr returns the values of lambda and a that minimize the average, or again whatever function is used for cvSummary, of

\sum_{q=1}^Q v_q\mbox{CV}(b,\tau_q).

If only one quantile is modeled then the gtr results can be ignored as they provide the same minimum solution as btr.
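
The weighted group error can be illustrated with a small base R computation for one combination of a and lambda. The values in `cv` are made up, and `mean` stands in for whatever function is passed as cvSummary; this is a sketch of the formula, not of the package internals.

```r
# cv[b, q] holds CV(b, tau_q) for fold b and quantile tau_q (made-up values).
cv <- matrix(c(0.40, 0.50, 0.45,
               0.42, 0.48, 0.47),
             nrow = 2, byrow = TRUE,
             dimnames = list(paste0("fold", 1:2),
                             c("tau0.25", "tau0.5", "tau0.75")))
tau_weights <- c(0.25, 0.5, 0.25)  # v_q, giving more weight to the median

# Weighted sum over quantiles within each fold.
group_fold_error <- drop(cv %*% tau_weights)

# Summarize across folds; mean plays the role of cvSummary here.
group_cv_error <- mean(group_fold_error)
```

Repeating this for every combination of a and lambda and taking the minimizer gives the kind of selection reported in gtr.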

Value

An rq.pen.seq.cv object.

cverr:: Matrix of cross-validation errors, summarized by the cvSummary function (default is the mean), for each model (tau and a combination) and lambda.
cvse:: Matrix of the standard errors of cverr for each model (tau and a combination) and lambda.
fit:: The rq.pen.seq object fit to the full data.
btr:: A data.table of the values of a and lambda that are best as determined by the minimum cross-validation error and by the one standard error rule, which fixes a. In btr the values of lambda and a are selected separately for each quantile.
gtr:: A data.table for the combination of a and lambda that minimizes the cross-validation error across all tau.
gcve:: Group, across all quantiles, cross-validation error results for each value of a and lambda.
call:: Original call to the function.

Author(s)

Ben Sherwood, ben.sherwood@ku.edu

Examples

## Not run: 
library(rqPen)
# Simulate 100 observations with 8 predictors and heteroscedastic errors
x <- matrix(runif(800), ncol = 8)
y <- 1 + x[,1] + x[,8] + (1 + .5*x[,3])*rnorm(100)
# Lasso fit for the median
r1 <- rq.pen.cv(x, y)
# Elastic net fit for multiple values of a and tau
r2 <- rq.pen.cv(x, y, penalty = "ENet", a = c(0,.5,1), tau = c(.25,.5,.75))
# Same as above, but with more weight given to the median when calculating the group cross-validation error
r3 <- rq.pen.cv(x, y, penalty = "ENet", a = c(0,.5,1), tau = c(.25,.5,.75), tauWeights = c(.25,.5,.25))
# Uses the median cross-validation error instead of the mean
r4 <- rq.pen.cv(x, y, cvSummary = median)
# Cross-validation with no penalty on the first variable
r5 <- rq.pen.cv(x, y, penalty.factor = c(0, rep(1,7)))

## End(Not run)

[Package rqPen version 4.1.1 Index]