cv.gam {gamreg}    R Documentation
Robust Cross-Validation
Description
Compute Robust Cross-Validation for selecting the best model.
Usage
cv.gam(X, Y, init.mode = c("sLTS", "RLARS", "RANSAC"),
lambda.mode = "lambda0", lmax = 1, lmin = 0.05, nlambda = 50,
fold = 10, ncores = 1, gam = 0.1, gam0 = 0.5, intercept = "TRUE",
alpha = 1, ini.subsamp = 0.2, ini.cand = 1000, alpha.LTS = 0.75,
nlambda.LTS = 40)
Arguments
X
Matrix of predictor variables.
Y
Matrix of response variables.
init.mode
Method used to compute the initial point: "sLTS" (sparse least trimmed squares), "RLARS" (robust least angle regression), or "RANSAC".
lambda.mode
If "lambda0" (the default), the grid of the sparse tuning parameter is determined automatically from the data; otherwise the grid is set by lmax and lmin.
lmax
When lambda.mode is not "lambda0", the maximum of the grid of the sparse tuning parameter.
lmin
When lambda.mode is not "lambda0", the minimum of the grid of the sparse tuning parameter.
nlambda
The number of grid points for Robust Cross-Validation.
fold
The number of folds for K-fold Robust Cross-Validation.
ncores
Positive integer giving the number of processor cores to be used for parallel computing (the default is 1 for no parallelization).
gam
Robust tuning parameter of the gamma-divergence for regression.
gam0
Tuning parameter of Robust Cross-Validation.
intercept
Should an intercept be fitted ("TRUE", the default) or not ("FALSE").
alpha
The elasticnet mixing parameter, with 0 <= alpha <= 1; alpha = 1 corresponds to the lasso penalty and alpha = 0 to the ridge penalty.
ini.subsamp
The fraction of subsamples used in "RANSAC".
ini.cand
The number of candidates for estimating initial points in "RANSAC".
alpha.LTS
The fraction of subsamples for trimmed squares in "sLTS".
nlambda.LTS
The number of grid points for the sparse tuning parameter in "sLTS".
Details
If the "RANSAC
" is used as the initial point, the parameter ini.subsamp
and ini.cand
can be determined carefully. The smaller ini.subsamp
is, the more robust initial point is. However, less efficiency.
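As a rough illustration (not taken from the package documentation), the call below sketches how these two arguments might be varied when "RANSAC" is chosen; it reuses the X and Y generated in the Examples section, and the particular values of ini.subsamp and ini.cand are arbitrary:
## Not run: sketch only; X and Y as generated in the Examples below
res.ransac <- cv.gam(X, Y, nlambda = 5, init.mode = "RANSAC",
                     ini.subsamp = 0.1,   # smaller fraction: more robust, less efficient
                     ini.cand = 2000)     # more candidate subsamples
## End(Not run)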
Value
lambda
A numeric vector giving the values of the penalty parameter.
fit
All fitted results at each value of lambda.
Rocv
The result of the best model selected by Robust Cross-Validation.
Author(s)
Takayuki Kawashima
References
Kawashima, T. and Fujisawa, H. (2017).
Robust and Sparse Regression via gamma-divergence, Entropy, 19(11).
Fujisawa, H. and Eguchi, S. (2008).
Robust parameter estimation with a small bias against heavy contamination, Journal of Multivariate Analysis, 99(9), 2053-2081.
Examples
## generate data
library(mvtnorm)
n <- 30 # number of observations
p <- 10 # number of explanatory variables
epsilon <- 0.1 # contamination ratio
beta0 <- 0.0 # intercept
beta <- c(numeric(p)) # regression coefficients
beta[1] <- 1
beta[2] <- 2
beta[3] <- 3
beta[4] <- 4
Sigma <- 0.2^t(sapply(1:p, function(i, j) abs(i-j), 1:p)) # covariance matrix with (i,j) entry 0.2^|i-j|
X <- rmvnorm(n, sigma=Sigma) # explanatory variables
e <- rnorm(n) # error terms
i <- 1:ceiling(epsilon*n) # index of outliers
e[i] <- e[i] + 20 # vertical outliers
Y <- beta0*(numeric(n)+1) + X%*%beta + e # response variables
res <- cv.gam(X, Y, nlambda = 5, nlambda.LTS = 20, init.mode = "sLTS")
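## The components of the fitted object can then be examined; a minimal
## sketch, assuming the returned list carries the component names listed
## under Value:
res$lambda  # grid of penalty parameter values
res$Rocv    # best model selected by Robust Cross-Validation
res$fit     # results at each value of lambda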