cv.emBayes {emBayes}    R Documentation

k-fold cross-validation for 'emBayes'

Description

This function performs cross-validation and returns the optimal values of the tuning parameters.

Usage

cv.emBayes(
  y,
  clin = NULL,
  X,
  quant,
  t0,
  t1,
  k,
  func,
  error = 0.01,
  maxiter = 100
)

Arguments

y

a vector of the response variable.

clin

a matrix of clinical factors. The default value is NULL.

X

a matrix of genetic factors.

quant

the quantile level to be estimated.

t0

a user-supplied sequence of the spike scale s_{0}.

t1

a user-supplied sequence of the slab scale s_{1}.

k

number of folds for cross-validation.

func

methods to perform variable selection. Two choices are available: "ssLASSO" and "ssQLASSO".

error

cutoff value for determining convergence. The algorithm reaches convergence if the difference in the expected log-likelihood of two iterations is less than the value of error. The default value is 0.01.

maxiter

the maximum number of iterations used in the estimation algorithm. The default value is 100.

Details

When performing cross-validation for emBayes, function cv.emBayes returns two sets of optimal tuning parameters and their corresponding cross-validation error matrices. The spike scale parameter CL.s0 and the slab scale parameter CL.s1 are obtained based on the quantile check loss. The spike scale parameter SL.s0 and the slab scale parameter SL.s1 are obtained based on the least squares loss. The spike scale parameter SIC.s0 and the slab scale parameter SIC.s1 are obtained based on the Schwarz Information Criterion (SIC). Corresponding error matrices CL.CV, SL.CV and SIC.CV can also be obtained from the output.

Schwarz Information Criterion has the following form:

SIC = \log\sum_{i=1}^{n} L(y_i - \hat{y}_i) + \frac{\log n}{2n}\,edf

where L(\cdot) is the check loss and edf is the number of residuals whose absolute value is close to zero (\leq 0.001). For the non-robust method "ssLASSO", one should use the least squares loss for tuning selection. For the robust method "ssQLASSO", one can use either the quantile check loss or SIC. We suggest SIC, as it has been widely adopted for tuning selection in high-dimensional quantile regression.
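As a sketch, the criterion above can be computed from the residuals of a fitted quantile regression as follows; the helper names check_loss and sic are illustrative and are not part of the emBayes package:

```r
# Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

# SIC = log(sum of check losses) + (log(n) / (2n)) * edf,
# where edf counts residuals with absolute value <= 0.001
sic <- function(y, y_hat, tau) {
  res <- y - y_hat
  n <- length(y)
  edf <- sum(abs(res) <= 0.001)
  log(sum(check_loss(res, tau))) + (log(n) / (2 * n)) * edf
}
```

A smaller SIC indicates a better trade-off between quantile fit and the effective number of parameters, so the tuning pair minimizing it is selected.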

Value

A list with components:

CL.s0

the optimal spike scale under check loss.

CL.s1

the optimal slab scale under check loss.

SL.s0

the optimal spike scale under least squares loss.

SL.s1

the optimal slab scale under least squares loss.

SIC.s0

the optimal spike scale under SIC.

SIC.s1

the optimal slab scale under SIC.

CL.CV

cross-validation error matrix under check loss.

SL.CV

cross-validation error matrix under least squares loss.

SIC.CV

cross-validation error matrix under SIC.
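Examples

A minimal illustrative call; the simulated data and the candidate scale grids below are assumptions chosen for demonstration, not values recommended by the package:

```r
library(emBayes)

set.seed(1)
n <- 50; p <- 100
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] - 2 * X[, 2] + rnorm(n)

# Candidate spike (t0) and slab (t1) scale grids (illustrative values)
t0 <- seq(0.01, 0.1, length.out = 5)
t1 <- seq(0.5, 1, length.out = 5)

# 5-fold cross-validation for robust spike-and-slab quantile LASSO
fit <- cv.emBayes(y, clin = NULL, X = X, quant = 0.5,
                  t0 = t0, t1 = t1, k = 5, func = "ssQLASSO",
                  error = 0.01, maxiter = 100)

# Optimal scales under SIC, as suggested for "ssQLASSO"
fit$SIC.s0
fit$SIC.s1
```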


[Package emBayes version 0.1.5 Index]