cv.gam {gamreg}    R Documentation
Robust Cross-Validation
Description
Compute Robust Cross-Validation for selecting the best model.
Usage
cv.gam(X, Y, init.mode = c("sLTS", "RLARS", "RANSAC"),
lambda.mode = "lambda0", lmax = 1, lmin = 0.05, nlambda = 50,
fold = 10, ncores = 1, gam = 0.1, gam0 = 0.5, intercept = "TRUE",
alpha = 1, ini.subsamp = 0.2, ini.cand = 1000, alpha.LTS = 0.75,
nlambda.LTS = 40)
Arguments
X
Matrix of predictor variables.
Y
Matrix of response variables.
init.mode
Method used to compute the initial point: "sLTS" (sparse least trimmed squares), "RLARS" (robust least angle regression), or "RANSAC".
lambda.mode
If "lambda0" (the default), the grid of the sparse tuning parameter is determined automatically from the data; otherwise the grid is set by lmax and lmin.
lmax
When lambda.mode is not "lambda0", the maximum of the grid of the sparse tuning parameter.
lmin
When lambda.mode is not "lambda0", the minimum of the grid of the sparse tuning parameter.
nlambda
The number of grid points for Robust Cross-Validation.
fold
The number of folds for K-fold Robust Cross-Validation.
ncores
Positive integer giving the number of processor cores to be used for parallel computing (the default is 1 for no parallelization).
gam
Robust tuning parameter of the gamma-divergence for regression.
gam0
Tuning parameter of Robust Cross-Validation.
intercept
Should an intercept be fitted ("TRUE", the default) or not ("FALSE").
alpha
The elasticnet mixing parameter, with 0 <= alpha <= 1; alpha = 1 corresponds to the lasso penalty and alpha = 0 to the ridge penalty.
ini.subsamp
The fraction of subsamples used in "RANSAC".
ini.cand
The number of candidates for estimating initial points in "RANSAC".
alpha.LTS
The fraction of subsamples for trimmed squares in "sLTS".
nlambda.LTS
The number of grid points for the sparse tuning parameter in "sLTS".
Details
If the "RANSAC
" is used as the initial point, the parameter ini.subsamp
and ini.cand
can be determined carefully. The smaller ini.subsamp
is, the more robust initial point is. However, less efficiency.
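As a rough illustration (not taken from the package documentation), the call below sketches how these two arguments might be varied when "RANSAC" is chosen; it reuses the X and Y generated in the Examples section, and the particular values of ini.subsamp and ini.cand are arbitrary:
## Not run: sketch only; X and Y as generated in the Examples below
res.ransac <- cv.gam(X, Y, nlambda = 5, init.mode = "RANSAC",
                     ini.subsamp = 0.1,   # smaller fraction: more robust, less efficient
                     ini.cand = 2000)     # more candidate subsamples
## End(Not run)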
Value
lambda
A numeric vector giving the values of the penalty parameter.
fit
All fitted results at each value of lambda.
Rocv
The result of the best model selected by Robust Cross-Validation.
Author(s)
Takayuki Kawashima
References
Kawashima, T. and Fujisawa, H. (2017).
Robust and Sparse Regression via gamma-divergence, Entropy, 19(11).
Fujisawa, H. and Eguchi, S. (2008).
Robust parameter estimation with a small bias against heavy contamination, Journal of Multivariate Analysis, 99(9), 2053-2081.
Examples
## generate data
library(mvtnorm)
n <- 30 # number of observations
p <- 10 # number of explanatory variables
epsilon <- 0.1 # contamination ratio
beta0 <- 0.0 # intercept
beta <- c(numeric(p)) # regression coefficients
beta[1] <- 1
beta[2] <- 2
beta[3] <- 3
beta[4] <- 4
Sigma <- 0.2^t(sapply(1:p, function(i, j) abs(i-j), 1:p)) # covariance matrix with (i,j) entry 0.2^|i-j|
X <- rmvnorm(n, sigma=Sigma) # explanatory variables
e <- rnorm(n) # error terms
i <- 1:ceiling(epsilon*n) # index of outliers
e[i] <- e[i] + 20 # vertical outliers
Y <- beta0*(numeric(n)+1) + X%*%beta + e # response variables
res <- cv.gam(X, Y, nlambda = 5, nlambda.LTS = 20, init.mode = "sLTS")
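## The components of the fitted object can then be examined; a minimal
## sketch, assuming the returned list carries the component names listed
## under Value:
res$lambda  # grid of penalty parameter values
res$Rocv    # best model selected by Robust Cross-Validation
res$fit     # results at each value of lambda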