cv.sglfit {midasml} | R Documentation |
Cross-validation fit for sg-LASSO
Description
Performs k-fold cross-validation for the sg-LASSO regression model.
The function runs sglfit nfolds+1 times: the first run computes the solution path over the λ sequence, and the remaining runs compute the fit with each fold omitted in turn.
The average error and its standard deviation over the folds are computed, and the optimal regression coefficients are returned for lam.min and lam.1se. Solutions are computed for a fixed γ.
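The bookkeeping described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: ridge_path() is a hypothetical closed-form ridge stand-in for sglfit (so the block runs without midasml), the error measure is assumed to be mean squared error, the spread is reported as the standard error of the fold means, and lam.1se is assumed to follow the usual one-standard-error rule.

```r
set.seed(1)
n <- 100; p <- 5; nfolds <- 10
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(3, 2, 1, 0, 0) + rnorm(n))
lambda <- 10^seq(2, -3, length.out = 20)   # decreasing sequence

# Hypothetical stand-in for sglfit: closed-form ridge, no intercept.
# Returns a p x length(lambda) matrix of coefficients.
ridge_path <- function(x, y, lambda) {
  sapply(lambda, function(l) {
    drop(solve(crossprod(x) / nrow(x) + l * diag(ncol(x)),
               crossprod(x, y) / nrow(x)))
  })
}

# Run 1 of nfolds + 1: path on the full sample.
full_fit <- ridge_path(x, y, lambda)

# Runs 2 .. nfolds + 1: refit with each fold omitted, score on that fold.
foldid <- sample(rep(seq_len(nfolds), length.out = n))
fold_mse <- sapply(seq_len(nfolds), function(k) {
  test <- foldid == k
  b <- ridge_path(x[!test, , drop = FALSE], y[!test], lambda)
  colMeans((y[test] - x[test, , drop = FALSE] %*% b)^2)
})

cvm  <- rowMeans(fold_mse)                     # average error per lambda
cvse <- apply(fold_mse, 1, sd) / sqrt(nfolds)  # assumed spread measure

i_min <- which.min(cvm)
# lambda is decreasing, so the smallest qualifying index is the largest lambda.
i_1se <- min(which(cvm <= cvm[i_min] + cvse[i_min]))

lam.min <- lambda[i_min]
lam.1se <- lambda[i_1se]
coef_min <- full_fit[, i_min]
coef_1se <- full_fit[, i_1se]
```

By construction lam.1se is never smaller than lam.min, so it selects an equally or more heavily penalized model from the same full-sample path.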
Usage
cv.sglfit(x, y, lambda = NULL, gamma = 1.0, gindex = 1:p,
          nfolds = 10, foldid, parallel = FALSE, ...)
Arguments
x | T by p data matrix, where T and p respectively denote the sample size and the number of regressors. |
y | T by 1 response variable. |
lambda | a user-supplied lambda sequence. By leaving this option unspecified (recommended), users can have the program compute its own λ sequence based on |
gamma | sg-LASSO mixing parameter. γ = 1 gives the LASSO solution and γ = 0 gives the group LASSO solution. |
gindex | p by 1 vector indicating group membership of each covariate. |
nfolds | number of folds of the cv loop. Default set to 10. |
foldid | the fold assignments used. |
parallel | if |
... | Other arguments that can be passed to sglfit. |
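A short sketch of how the gindex and foldid arguments might be built by hand. It assumes that foldid is an integer vector of length T with values in 1..nfolds (the convention used by glmnet), which this page does not state. The cv.sglfit call is left commented out because it needs midasml and data x, y.

```r
set.seed(1)
n <- 100       # sample size T
nfolds <- 10

# Four groups of five consecutive covariates (p = 20).
gindex <- rep(1:4, each = 5)

# Balanced random fold labels: each label 1..nfolds appears n / nfolds times.
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)

# Hypothetical usage: reuse the same folds for two values of gamma.
# fit_a <- cv.sglfit(x = x, y = y, gindex = gindex, gamma = 0.2, foldid = foldid)
# fit_b <- cv.sglfit(x = x, y = y, gindex = gindex, gamma = 0.8, foldid = foldid)
```

Because solutions are computed for a fixed γ, holding foldid fixed is one way to make cross-validation errors comparable across several γ values.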
Details
The cross-validation is run for the sg-LASSO linear model. The sequence of linear regression models implied by the λ vector is fit by block coordinate descent. The objective function is
$$\|y - \iota\alpha - x\beta\|_T^2 + 2\lambda\,\Omega_\gamma(\beta),$$
where $\iota \in \mathbb{R}^T$ and $\|u\|_T^2 = \langle u, u \rangle / T$ is the squared norm induced by the empirical inner product. The penalty function $\Omega_\gamma(\cdot)$ is applied to the β coefficients and is
$$\Omega_\gamma(\beta) = \gamma\,|\beta|_1 + (1-\gamma)\,|\beta|_{2,1},$$
a convex combination of the LASSO and group LASSO penalty functions.
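The objective and penalty above can be evaluated directly in base R. The helper names sgl_penalty() and sgl_objective() are hypothetical and not part of midasml. The sketch assumes that |β|2,1 is the unweighted sum of the Euclidean norms of the coefficient groups defined by gindex, and that ι is a vector of ones, so that α acts as an intercept.

```r
# Penalty: gamma * |beta|_1 + (1 - gamma) * |beta|_{2,1}
sgl_penalty <- function(beta, gindex, gamma) {
  l1  <- sum(abs(beta))
  # Sum over groups of the Euclidean norm of each group's coefficients.
  l21 <- sum(tapply(beta, gindex, function(b) sqrt(sum(b^2))))
  gamma * l1 + (1 - gamma) * l21
}

# Objective: ||y - alpha - x beta||_T^2 + 2 * lambda * penalty
sgl_objective <- function(x, y, alpha, beta, lambda, gamma, gindex) {
  r <- y - alpha - drop(x %*% beta)
  sum(r^2) / length(y) + 2 * lambda * sgl_penalty(beta, gindex, gamma)
}

beta   <- c(3, 4, 0, 0)
gindex <- c(1, 1, 2, 2)
sgl_penalty(beta, gindex, gamma = 1)    # 7: LASSO, |3| + |4|
sgl_penalty(beta, gindex, gamma = 0)    # 5: group LASSO, sqrt(3^2 + 4^2) + 0
sgl_penalty(beta, gindex, gamma = 0.5)  # 6: equal mix of the two
```

The two endpoints reproduce the behaviour described for the gamma argument: γ = 1 reduces to the LASSO penalty and γ = 0 to the group LASSO penalty.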
Value
cv.sglfit object.
Author(s)
Jonas Striaukas
Examples
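set.seed(1)
# Simulate 100 observations of 20 regressors; only the first five are active
x = matrix(rnorm(100 * 20), 100, 20)
beta = c(5,4,3,2,1,rep(0, times = 15))
y = x%*%beta + rnorm(100)
# Four groups of five consecutive covariates
gindex = sort(rep(1:4,times=5))
cv.sglfit(x = x, y = y, gindex = gindex, gamma = 0.5,
          standardize = FALSE, intercept = FALSE)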
## Not run:
# Parallel: register a doMC backend with two cores
library(doMC)
registerDoMC(cores = 2)
x = matrix(rnorm(1000 * 20), 1000, 20)
beta = c(5,4,3,2,1,rep(0, times = 15))
y = x%*%beta + rnorm(1000)
gindex = sort(rep(1:4,times=5))
# Sequential timing
system.time(cv.sglfit(x = x, y = y, gindex = gindex, gamma = 0.5,
                      standardize = FALSE, intercept = FALSE))
# Parallel timing
system.time(cv.sglfit(x = x, y = y, gindex = gindex, gamma = 0.5,
                      standardize = FALSE, intercept = FALSE, parallel = TRUE))
## End(Not run)