cv.grpreg {grpreg} | R Documentation |
Cross-validation for grpreg/grpsurv
Description
Performs k-fold cross validation for penalized regression models with grouped covariates over a grid of values for the regularization parameter lambda.
Usage
cv.grpreg(X, y, group=1:ncol(X), ..., nfolds=10, seed, fold,
returnY=FALSE, trace=FALSE)
cv.grpsurv(X, y, group, ..., nfolds=10, seed, fold, se=c('quick',
'bootstrap'), returnY=FALSE, trace=FALSE)
Arguments
X |
The design matrix, as in |
y |
The response vector (or matrix), as in |
group |
The grouping vector, as in |
... |
Additional arguments to |
nfolds |
The number of cross-validation folds. Default is 10. |
seed |
You may set the seed of the random number generator in order to obtain reproducible results. |
fold |
Which fold each observation belongs to. By default the observations are randomly assigned. |
returnY |
Should |
trace |
If set to TRUE, cv.grpreg will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE. |
se |
For |
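A minimal sketch of how the seed and fold arguments might be used, on simulated data (the object names n, X, y, group and fold below are illustrative, not part of the package). A fold vector built by hand lets several fits share the same split. The calls to cv.grpreg are guarded so the snippet still runs where grpreg is not installed.

```r
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), n, 4)
y <- X[, 1] + rnorm(n)
group <- c(1, 1, 2, 2)

# Hand-made assignment: each observation gets a fold label in 1..nfolds
nfolds <- 5
fold <- sample(rep(1:nfolds, length.out = n))

if (requireNamespace("grpreg", quietly = TRUE)) {
  # Supply the split explicitly ...
  cv_fold <- grpreg::cv.grpreg(X, y, group, fold = fold)
  # ... or let the function draw the folds, reproducibly via seed
  cv_seed <- grpreg::cv.grpreg(X, y, group, nfolds = nfolds, seed = 123)
}
```

Supplying fold is mainly useful when two models should be compared on identical held-out sets; seed is enough when only reproducibility is needed.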
Details
The function calls grpreg/grpsurv nfolds times, each time leaving out
1/nfolds of the data. The cross-validation error is based on the
deviance; see here for more details.
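The leave-out loop described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: a ridge fit (the hypothetical helper ridge_fit) stands in for the grouped penalty, the response is Gaussian so the deviance contribution is the squared error, and the lambda grid is made up.

```r
set.seed(42)
n <- 60; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(2, 0, -1, 0)) + rnorm(n)
lambda <- c(10, 1, 0.1, 0.01)          # illustrative grid
nfolds <- 5
fold <- sample(rep(1:nfolds, length.out = n))

# Stand-in penalized fit (ridge), NOT the group penalty used by grpreg
ridge_fit <- function(X, y, lam) {
  Xc <- cbind(1, X)
  P <- diag(c(0, rep(lam, ncol(X))))   # intercept is left unpenalized
  solve(crossprod(Xc) + nrow(X) * P, crossprod(Xc, y))
}

# E[i, j]: squared error for observation i at lambda[j], from the fit
# in which observation i was held out
E <- matrix(NA_real_, n, length(lambda))
for (i in 1:nfolds) {
  out <- fold == i
  for (j in seq_along(lambda)) {
    b <- ridge_fit(X[!out, , drop = FALSE], y[!out], lambda[j])
    pred <- cbind(1, X[out, , drop = FALSE]) %*% b
    E[out, j] <- (y[out] - pred)^2
  }
}

cve <- colMeans(E)                     # cross-validation error per lambda
cvse <- apply(E, 2, sd) / sqrt(n)      # simple standard error per lambda
lambda.min <- lambda[which.min(cve)]
```

Every observation is held out exactly once, so each row of E is filled by exactly one fold.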
For Gaussian and Poisson responses, the folds are chosen according to
simple random sampling. For binomial responses, the numbers for each
outcome class are balanced across the folds; i.e., the number of
observations in which y is equal to 1 is the same for each fold, or
possibly off by 1 if the numbers do not divide evenly. This approach
is used for Cox regression as well to balance the amount of censoring
across each fold.
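One way to obtain such a balanced assignment is to deal fold labels out separately within each outcome class. The helper balanced_folds below is hypothetical and only sketches the idea; the package's internal code may differ.

```r
# Assign folds within each class so that class counts per fold differ by at most 1
balanced_folds <- function(y, nfolds) {
  fold <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Labels 1..nfolds recycled to the class size, then shuffled
    fold[idx] <- sample(rep(1:nfolds, length.out = length(idx)))
  }
  fold
}

set.seed(7)
y <- rbinom(53, 1, 0.3)
fold <- balanced_folds(y, nfolds = 5)
ones_per_fold <- tabulate(fold[y == 1], nbins = 5)
zeros_per_fold <- tabulate(fold[y == 0], nbins = 5)
```

For Cox models the same idea would apply with the censoring indicator in place of y.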
For Cox models, cv.grpsurv uses the approach of calculating the full
Cox partial likelihood using the cross-validated set of linear
predictors. Other approaches to cross-validation for the Cox
regression model have been proposed in the literature; the strengths
and weaknesses of the various methods for penalized regression in the
Cox model are the subject of current research. A simple approximation
to the standard error is provided, although an option to bootstrap the
standard error (se='bootstrap') is also available.
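To make the idea concrete, the following sketch evaluates a Cox partial log-likelihood for a given vector of linear predictors. The function cox_pl is hypothetical, assumes no tied event times, and is not the package's internal code. In the cross-validated setting, eta would hold, for each observation, the linear predictor from the fit in which that observation was left out.

```r
# Cox partial log-likelihood for linear predictors eta (assumes no tied times)
cox_pl <- function(eta, time, status) {
  ord <- order(time)
  eta <- eta[ord]
  status <- status[ord]
  # After sorting by time, the risk set of observation i is i, ..., n,
  # so the risk-set sums are reverse cumulative sums of exp(eta)
  log_risk <- log(rev(cumsum(rev(exp(eta)))))
  sum(status * (eta - log_risk))
}

# With eta = 0 and three uncensored observations the risk sets have
# sizes 3, 2, 1, so the log-likelihood is -(log 3 + log 2 + log 1)
ll <- cox_pl(rep(0, 3), time = c(1, 2, 3), status = c(1, 1, 1))
dev <- -2 * ll   # deviance scale
```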
As in grpreg, seemingly unrelated regressions/multitask learning can
be carried out by setting y to be a matrix, in which case groups are
set up automatically (see grpreg for details), and cross-validation is
carried out with respect to rows of y. As mentioned in the details
there, it is recommended to standardize the responses prior to fitting.
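A brief sketch of the matrix-response case, assuming simulated data with two responses on very different scales (all object names are illustrative). The responses are standardized with base R's scale before fitting; the cv.grpreg call is guarded so the snippet runs without the package.

```r
set.seed(1)
n <- 40
X <- matrix(rnorm(n * 3), n, 3)
# Two responses on very different scales
Y <- cbind(a = 10 * X[, 1] + rnorm(n, sd = 5),
           b = 0.1 * X[, 1] + rnorm(n, sd = 0.05))

# Standardize each response column to mean 0 and standard deviation 1
Ys <- scale(Y)

if (requireNamespace("grpreg", quietly = TRUE)) {
  # group is omitted; with a matrix response the groups are set up automatically
  cvfit <- grpreg::cv.grpreg(X, Ys)
}
```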
Value
An object with S3 class "cv.grpreg" containing:
cve |
The error for each value of |
cvse |
The estimated standard error associated with each value
of for |
lambda |
The sequence of regularization parameter values along which the cross-validation error was calculated. |
fit |
The fitted |
fold |
The fold assignments for cross-validation for each
observation; note that for |
min |
The index of |
lambda.min |
The value of |
null.dev |
The deviance for the intercept-only model. |
pe |
If |
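As a hedged illustration of working with the returned object, the components cve, cvse and lambda listed above can be combined into a small table of the cross-validation curve. The data are simulated and the name path is illustrative; the code only runs when grpreg is installed.

```r
if (requireNamespace("grpreg", quietly = TRUE)) {
  set.seed(1)
  X <- matrix(rnorm(200), 50, 4)
  y <- X[, 1] + rnorm(50)
  cvfit <- grpreg::cv.grpreg(X, y, group = c(1, 1, 2, 2), seed = 1)

  # One row per lambda: error plus/minus one standard error
  path <- data.frame(lambda = cvfit$lambda,
                     cve    = cvfit$cve,
                     lower  = cvfit$cve - cvfit$cvse,
                     upper  = cvfit$cve + cvfit$cvse)
  print(path[which.min(path$cve), ])
}
```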
Author(s)
Patrick Breheny
See Also
grpreg, plot.cv.grpreg, summary.cv.grpreg, predict.cv.grpreg
Examples
data(Birthwt)
X <- Birthwt$X
y <- Birthwt$bwt
group <- Birthwt$group
cvfit <- cv.grpreg(X, y, group)
plot(cvfit)
summary(cvfit)
coef(cvfit) ## Beta at minimum CVE
cvfit <- cv.grpreg(X, y, group, penalty="gel")
plot(cvfit)
summary(cvfit)