cv.OHPL {OHPL} | R Documentation |
Cross-Validation for Ordered Homogeneity Pursuit Lasso
Description
Use cross-validation to help select the optimal number of variable groups and the value of gamma.
Usage
cv.OHPL(X.cal, y.cal, maxcomp, gamma = seq(0.1, 0.9, 0.1), X.test,
y.test, cv.folds = 5L, G = 30L, type = c("max", "median"),
scale = TRUE, pls.method = "simpls")
Arguments
X.cal |
Predictor matrix (training) |
y.cal |
Response matrix with one column (training) |
maxcomp |
Maximum number of components for PLS |
gamma |
A vector of candidate gamma values, each in (0, 1). |
X.test |
Predictor matrix (test) |
y.test |
Response matrix with one column (test) |
cv.folds |
Number of cross-validation folds |
G |
Maximum number of variable groups |
type |
Find the maximum absolute correlation ("max") or the median absolute correlation ("median"). Default is "max". |
scale |
Should the predictor matrix be scaled?
Default is TRUE. |
pls.method |
Method for fitting the PLS model.
Default is "simpls". |
Value
A list containing the optimal model, RMSEP, Q2, and other evaluation metrics, as well as the optimal number of groups to use in the group lasso.
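For orientation, RMSEP and Q2 are conventionally computed from observed and predicted responses as shown in the base R sketch below. The helper names rmsep() and q2() are hypothetical and are not part of OHPL, and the package's exact formulas may differ (for example, in which mean is used for the total sum of squares).

```r
# Hypothetical helpers (not OHPL functions), using the conventional definitions.

# Root mean squared error of prediction.
rmsep <- function(y, y.hat) sqrt(mean((y - y.hat)^2))

# Q2: one minus the ratio of the prediction error sum of squares
# to the total sum of squares around the mean of y.
q2 <- function(y, y.hat) 1 - sum((y - y.hat)^2) / sum((y - mean(y))^2)

# Toy values for illustration only.
y.obs <- c(10, 12, 14, 16)
y.pred <- c(11, 12, 13, 16)

rmsep.demo <- rmsep(y.obs, y.pred)
q2.demo <- q2(y.obs, y.pred)
```

A lower RMSEP and a Q2 closer to 1 both indicate better predictive performance on the data the metrics are computed from.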
Examples
data("wheat")
X <- wheat$x
y <- wheat$protein
n <- nrow(wheat$x)
set.seed(1001)
samp.idx <- sample(1L:n, round(n * 0.7))
X.cal <- X[samp.idx, ]
y.cal <- y[samp.idx]
X.test <- X[-samp.idx, ]
y.test <- y[-samp.idx]
# this could run a while
## Not run:
cv.fit <- cv.OHPL(
  X.cal, y.cal,
  maxcomp = 6, gamma = seq(0.1, 0.9, 0.1),
  X.test, y.test, cv.folds = 5, G = 30, type = "max"
)
# the optimal G and gamma
cv.fit$opt.G
cv.fit$opt.gamma
## End(Not run)
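Selecting an optimal pair such as opt.G and opt.gamma can be pictured as finding the smallest entry in a grid of cross-validation errors. The sketch below is only an illustration of that idea in base R: the error surface is synthetic, the grid ranges and the names cv.err, opt.G.demo, and opt.gamma.demo are assumptions, and none of it reflects how cv.OHPL is implemented internally.

```r
# Synthetic CV error surface for illustration only (not real OHPL output).
# Rows index the candidate number of groups, columns index candidate gamma values.
G.seq <- 2:30                      # assumed grid of group counts
gamma.seq <- seq(0.1, 0.9, 0.1)    # same gamma grid as the default argument
cv.err <- outer(
  G.seq, gamma.seq,
  function(g, gam) (g - 12)^2 / 100 + (gam - 0.4)^2
)

# Locate the grid cell with the smallest error.
best <- which(cv.err == min(cv.err), arr.ind = TRUE)[1, ]
opt.G.demo <- G.seq[best["row"]]
opt.gamma.demo <- gamma.seq[best["col"]]
```

With this synthetic surface, the minimum sits at the grid point nearest G = 12 and gamma = 0.4 by construction; with real data, the grid would hold errors averaged over the cross-validation folds.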