penAFT.cv {penAFT} | R Documentation |
Cross-validation function for fitting a regularized semiparametric accelerated failure time model
Description
A function to perform cross-validation and compute the solution path for the regularized semiparametric accelerated failure time model estimator.
Usage
penAFT.cv(X, logY, delta, nlambda = 50,
          lambda.ratio.min = 0.1, lambda = NULL,
          penalty = NULL, alpha = 1, weight.set = NULL,
          groups = NULL, tol.abs = 1e-8, tol.rel = 2.5e-4,
          standardize = TRUE, nfolds = 5, cv.index = NULL,
          admm.max.iter = 1e4, quiet = TRUE)
Arguments
X |
An $n \times p$ matrix of predictors. Observations should be organized by row. |
logY |
An $n$-dimensional vector of log-survival or log-censoring times. |
delta |
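An $n$-dimensional binary vector indicating whether the $j$th component of logY is an observed log-survival time ($\delta_j = 1$) or a log-censoring time ($\delta_j = 0$) for $j = 1, \dots, n$. |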
nlambda |
The number of candidate tuning parameters to consider. |
lambda.ratio.min |
The ratio of the minimum to the maximum candidate tuning parameter value. As a default, we suggest 0.1, but standard model selection procedures should be applied to select $\lambda$. |
lambda |
An optional (not recommended) prespecified vector of candidate tuning parameters. Should be in descending order. |
penalty |
Either "EN" or "SG" for elastic net or sparse group lasso penalties. |
alpha |
The tuning parameter $\alpha$, which balances the two terms of the penalty. See Details. |
weight.set |
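A list of weights. For both penalties, $w$ is a $p$-dimensional vector of non-negative weights. For the "SG" penalty, the list can also include $v$, a non-negative vector with one entry per group. See Details. |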
groups |
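When using penalty "SG", a $p$-dimensional vector of integers giving the group assignment of each predictor (i.e., each column of X). |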
tol.abs |
Absolute convergence tolerance. |
tol.rel |
Relative convergence tolerance. |
standardize |
Should predictors be standardized (i.e., scaled to have unit variance) for model fitting? |
nfolds |
The number of folds to be used for cross-validation. Default is five. Ten is recommended when sample size is especially small. |
cv.index |
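A list of length nfolds of indices to be used for cross-validation. Each element contains the indices of the subjects in that fold. Use with caution: if supplied, this takes precedence over nfolds. |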
admm.max.iter |
Maximum number of ADMM iterations. |
quiet |
TRUE or FALSE, indicating whether progress output should be suppressed. |
Details
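Given $(\log y_i, x_i, \delta_i)_{i=1}^n$, where for subject $i$ ($i = 1, \dots, n$), $y_i$ is the minimum of the survival time and censoring time, $x_i$ is a $p$-dimensional predictor, and $\delta_i$ is the indicator of censoring ($\delta_i = 1$ if the survival time is observed), penAFT.cv performs nfolds-fold cross-validation for selecting the tuning parameter to be used in the argument minimizing
$$\frac{1}{n^2}\sum_{i=1}^n \sum_{j=1}^n \delta_i \{ \log y_i - \log y_j - (x_i - x_j)'\beta \}^{-} + \lambda g(\beta),$$
where $\{a\}^{-} := \max(-a, 0)$, $\lambda > 0$, and $g$ is either the weighted elastic net penalty (penalty = "EN") or weighted sparse group lasso penalty (penalty = "SG").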
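The weighted elastic net penalty is defined as
$$\alpha \| w \circ \beta \|_1 + \frac{(1-\alpha)}{2} \|\beta\|_2^2,$$
where $w$ is a set of non-negative weights (which can be specified in the weight.set argument). The weighted sparse group-lasso penalty we consider is
$$\alpha \| w \circ \beta \|_1 + (1-\alpha) \sum_{l=1}^{G} v_l \|\beta_{\mathcal{G}_l}\|_2,$$
where again, $w$ is a set of non-negative weights and $v_l$ are weights applied to each of the $G$ groups.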
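The objective described above can be sketched in a few lines of base R. This is only an illustration of the formulas, not the package's ADMM solver: the helper names `gehan_loss`, `en_penalty`, `sg_penalty`, and `objective` are hypothetical, the $1/n^2$ scaling of the loss is assumed, and the group weights `v` are assumed to be ordered by sorted group label.

```r
# Gehan loss: (1/n^2) * sum_i sum_j delta_i * max(-(e_i - e_j), 0),
# with residuals e_i = logY_i - x_i' beta (scaling is an assumption).
gehan_loss <- function(beta, X, logY, delta) {
  e <- as.numeric(logY - X %*% beta)
  diffs <- outer(e, e, "-")               # diffs[i, j] = e_i - e_j
  sum(delta * pmax(-diffs, 0)) / length(e)^2
}

# Weighted elastic net: alpha * ||w o beta||_1 + (1 - alpha)/2 * ||beta||_2^2
en_penalty <- function(beta, alpha, w = rep(1, length(beta))) {
  alpha * sum(abs(w * beta)) + (1 - alpha) / 2 * sum(beta^2)
}

# Weighted sparse group lasso:
# alpha * ||w o beta||_1 + (1 - alpha) * sum_l v_l * ||beta_{G_l}||_2
sg_penalty <- function(beta, alpha, groups,
                       w = rep(1, length(beta)),
                       v = rep(1, length(unique(groups)))) {
  # tapply orders groups by sorted label; v is assumed to follow that order
  group_norms <- tapply(beta, groups, function(b) sqrt(sum(b^2)))
  alpha * sum(abs(w * beta)) + (1 - alpha) * sum(v * group_norms)
}

# Penalized objective for a given lambda (elastic net version)
objective <- function(beta, X, logY, delta, lambda, alpha) {
  gehan_loss(beta, X, logY, delta) + lambda * en_penalty(beta, alpha)
}

# Tiny illustration on simulated data
set.seed(1)
X <- matrix(rnorm(20 * 4), 20, 4)
beta <- c(1, -2, 0, 3)
logY <- as.numeric(X %*% beta + rnorm(20))
delta <- rbinom(20, 1, 0.7)
obj_value <- objective(beta, X, logY, delta, lambda = 0.1, alpha = 0.5)
```

With `alpha = 1` both penalties reduce to a weighted lasso penalty, which matches the role of `alpha` in the definitions above.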
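Next, we define the cross-validation errors.
Let $\mathcal{V}_1, \dots, \mathcal{V}_K$ be a random nfolds $= K$ element partition of $[n]$ (the subjects) with the cardinality of each $\mathcal{V}_k$ (the "kth fold") approximately equal for $k = 1, \dots, K$.
Let $\hat{\beta}_{\lambda(-\mathcal{V}_k)}$ be the solution with tuning parameter $\lambda$ using only data indexed by $[n] \setminus \mathcal{V}_k$ (i.e., outside the kth fold). Then, defining
$$e_i(\hat{\beta}_{\lambda(-\mathcal{V}_k)}) := \log y_i - \hat{\beta}_{\lambda(-\mathcal{V}_k)}' x_i$$
for $i = 1, \dots, n$, we call
$$\frac{1}{|\mathcal{V}_k|^2} \sum_{i \in \mathcal{V}_k} \sum_{j \in \mathcal{V}_k} \delta_i \{ e_i(\hat{\beta}_{\lambda(-\mathcal{V}_k)}) - e_j(\hat{\beta}_{\lambda(-\mathcal{V}_k)}) \}^{-}$$
the cross-validated Gehan loss at $\lambda$ in the $k$th fold, and refer to the sum over all nfolds $= K$ folds as the cross-validated Gehan loss.
Similarly, letting
$$\tilde{e}_i(\hat{\beta}_\lambda) = \sum_{k=1}^{K} (\log y_i - \hat{\beta}_{\lambda(-\mathcal{V}_k)}' x_i) \, \mathbf{1}(i \in \mathcal{V}_k)$$
for each $i \in [n]$, we call
$$\frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \delta_i \{ \tilde{e}_i(\hat{\beta}_\lambda) - \tilde{e}_j(\hat{\beta}_\lambda) \}^{-}$$
the cross-validated linear predictor score at $\lambda$.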
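The two cross-validation errors can be illustrated with a small base R sketch. Several pieces here are assumptions made for illustration: `fit_standin` is a ridge regression used purely as a placeholder for the penalized Gehan estimator (it is not what penAFT fits), the names `gehan_loss`, `cv_err_obj`, and `cv_err_linPred` are hypothetical, and both errors are scaled by the squared number of subjects involved, which may differ from the package's internal scaling.

```r
set.seed(1)
n <- 40; p <- 3; K <- 5
X <- matrix(rnorm(n * p), n, p)
logY <- as.numeric(X %*% c(1, -1, 0) + rnorm(n))
delta <- rbinom(n, 1, 0.7)
lambdas <- c(10, 1, 0.1)                  # descending candidate values

# Gehan-type loss of a residual vector, scaled by (number of residuals)^2
gehan_loss <- function(e, delta) {
  sum(delta * pmax(-outer(e, e, "-"), 0)) / length(e)^2
}

# Placeholder estimator (ridge regression), NOT the penalized Gehan estimator
fit_standin <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

# Random partition of the subjects into K folds of roughly equal size
folds <- split(sample(n), rep(1:K, length.out = n))

cv_err_obj <- matrix(NA_real_, K, length(lambdas))  # per-fold Gehan losses
e_tilde <- matrix(NA_real_, n, length(lambdas))     # out-of-fold residuals

for (k in seq_len(K)) {
  test <- folds[[k]]
  for (l in seq_along(lambdas)) {
    # Fit using only subjects outside the kth fold
    beta_k <- fit_standin(X[-test, , drop = FALSE], logY[-test], lambdas[l])
    # Residuals for subjects inside the kth fold
    e <- as.numeric(logY[test] - X[test, , drop = FALSE] %*% beta_k)
    cv_err_obj[k, l] <- gehan_loss(e, delta[test])
    e_tilde[test, l] <- e
  }
}

# Linear predictor score: pool the out-of-fold residuals over all subjects
cv_err_linPred <- apply(e_tilde, 2, gehan_loss, delta = delta)
```

The per-fold loss only compares subjects within the same fold, whereas the linear predictor score compares every pair of subjects using residuals from whichever fit excluded each subject. The latter uses more pairs, which can matter when folds are small or censoring is heavy.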
Value
full.fit |
A model fit with the same output as a model fit using penAFT. See the documentation for penAFT for details. |
cv.err.linPred |
A nlambda-dimensional vector of cross-validated linear predictor scores. |
cv.err.obj |
A nfolds $\times$ nlambda matrix of cross-validated Gehan losses. |
cv.index |
A list of length nfolds. Each element contains the indices of the subjects belonging to that fold. |
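As a rough guide to reading these outputs, the sketch below picks a tuning parameter from a mock object. The list `cv_fit` is hand-built with made-up placeholder numbers so the snippet runs without the package; it only mimics the components named above plus `full.fit$lambda`. The layout of `cv.err.obj` (folds in rows, tuning parameters in columns) is an assumption.

```r
# Mock object with the documented component names (all values are placeholders)
cv_fit <- list(
  full.fit = list(lambda = c(1.0, 0.5, 0.25, 0.125)),
  cv.err.linPred = c(9.1, 7.4, 6.8, 7.9),
  cv.err.obj = matrix(c(4.6, 4.5,
                        3.8, 3.6,
                        3.3, 3.5,
                        3.9, 4.0), nrow = 2)  # 2 folds x 4 tuning parameters
)

# Tuning parameter minimizing the cross-validated linear predictor score
best_linPred <- which.min(cv_fit$cv.err.linPred)
lambda_linPred <- cv_fit$full.fit$lambda[best_linPred]

# Alternative: minimize the Gehan loss summed over folds
best_obj <- which.min(colSums(cv_fit$cv.err.obj))
lambda_obj <- cv_fit$full.fit$lambda[best_obj]
```

The two criteria need not agree on real data; comparing them is a quick check on how sensitive the selected tuning parameter is to the choice of cross-validation error.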
Examples
# --------------------------------------
# Generate data
# --------------------------------------
set.seed(1)
genData <- genSurvData(n = 50, p = 50, s = 10, mag = 2, cens.quant = 0.6)
X <- genData$X
logY <- genData$logY
delta <- genData$status
p <- dim(X)[2]
# -----------------------------------------------
# Fit elastic net penalized estimator
# -----------------------------------------------
fit.en <- penAFT.cv(X = X, logY = logY, delta = delta,
                    nlambda = 10, lambda.ratio.min = 0.1,
                    penalty = "EN", nfolds = 5,
                    alpha = 1)
# ---- coefficients at tuning parameter minimizing cross-validation error
coef.en <- penAFT.coef(fit.en)
# ---- predict at 8th tuning parameter from full fit
Xnew <- matrix(rnorm(10*p), nrow=10)
predict.en <- penAFT.predict(fit.en, Xnew = Xnew, lambda = fit.en$full.fit$lambda[8])
# -----------------------------------------------
# Fit sparse group penalized estimator
# -----------------------------------------------
groups <- rep(1:5, each = 10)
fit.sg <- penAFT.cv(X = X, logY = logY, delta = delta,
                    nlambda = 50, lambda.ratio.min = 0.01,
                    penalty = "SG", groups = groups, nfolds = 5,
                    alpha = 0.5)
# -----------------------------------------------
# Pass fold indices
# -----------------------------------------------
groups <- rep(1:5, each = 10)
cv.index <- list()
for (k in 1:5) {
  cv.index[[k]] <- which(rep(1:5, length = 50) == k)
}
fit.sg.cvIndex <- penAFT.cv(X = X, logY = logY, delta = delta,
                            nlambda = 50, lambda.ratio.min = 0.01,
                            penalty = "SG", groups = groups,
                            cv.index = cv.index,
                            alpha = 0.5)
# --- compare cv indices
## Not run: fit.sg.cvIndex$cv.index == cv.index