R: Choice of penalty parameter in ahazpen

tune.ahazpen {ahaz}

R Documentation

Choice of penalty parameter in ahazpen

Description

Tunes the penalty parameter of the penalized semiparametric additive hazards model, either by cross-validation or by a non-stochastic procedure akin to BIC for likelihood-based models.

Usage

tune.ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(),
             tune=cv.control(), dfmax=nvars, lambda, ...)

Arguments

`surv`	Response in the form of a survival object, as returned by the function `Surv()` in the package survival. Right-censored data and counting process format (left-truncation) are supported. Tied survival times are not supported.
`X`	Design matrix. Missing values are not supported.
`weights`	Optional vector of observation weights. Default is 1 for each observation.
`standardize`	Logical flag for variable standardization, prior to model fitting. Parameter estimates are always returned on the original scale. Default is `standardize=TRUE`.
`penalty`	A description of the penalty function to be used for model fitting. This can be a character string naming a penalty function (currently `"lasso"` or stepwise SCAD, `"sscad"`) or it can be a call to the penalty function. Default is `penalty=lasso.control()`. See `ahazpen.pen.control` for the available penalty functions and advanced options; see also the examples.
`dfmax`	Limit on the maximum number of covariates included in the model. Default is `nvars=nobs-1`. Unless a complete regularization path is needed, it is highly recommended to start with a relatively small value of `dfmax` to reduce computation time and memory usage.
`lambda`	An optional user-supplied sequence of penalty parameters. Typical usage is to have the program compute its own `lambda` sequence based on `nlambda` and `lambda.min`.
`tune`	A description of the tuning method to be used. This can be a character string naming a tuning control function (currently `"cv"` or `"bic"`) or a call to the tuning control function. Default is 5-fold cross-validation, `tune=cv.control()`. See `ahaz.tune.control` for more options; see also the examples.
`...`	Additional arguments passed to `ahazpen`; see `ahazpen` for the available options.
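
As described for `penalty` and `tune`, a control can be named by a character string or given as a call. The following is a minimal sketch of the two forms, assuming the ahaz and survival packages are installed and reusing the `sorlie` data from the examples. The object names `fit.chr` and `fit.call` are illustrative only.

```r
library(ahaz)      # tune.ahazpen(), lasso.control(), cv.control(), sorlie data
library(survival)  # Surv()

data(sorlie)
set.seed(10101)
time <- sorlie$time + runif(nrow(sorlie)) * 1e-2  # break ties
surv <- Surv(time, sorlie$status)
X <- as.matrix(sorlie[, 3:ncol(sorlie)])

# Controls named by character strings
set.seed(10101)
fit.chr <- tune.ahazpen(surv, X, penalty = "lasso", tune = "cv", dfmax = 30)

# The same controls given as calls (these are the documented defaults)
set.seed(10101)
fit.call <- tune.ahazpen(surv, X, penalty = lasso.control(),
                         tune = cv.control(), dfmax = 30)

# With the same seed, both forms are expected to select the same lambda
fit.chr$lambda.min
fit.call$lambda.min
```

The call form is the one to use when non-default options are needed, since options can only be set through the control function's arguments.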

Details

The function first performs a penalized fit, using the penalty supplied in `penalty`, to obtain a sequence of penalty parameters. It then selects the optimal penalty parameter from this sequence according to the tuning control function given in `tune`; see `ahaz.tune.control`.
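
The two stages can be inspected on a fitted object. This is a minimal sketch, assuming the ahaz and survival packages are installed. It uses only components listed under Value, and `tuned` is an illustrative name.

```r
library(ahaz)      # tune.ahazpen() and the sorlie data
library(survival)  # Surv()

data(sorlie)
set.seed(10101)
time <- sorlie$time + runif(nrow(sorlie)) * 1e-2  # break ties
surv <- Surv(time, sorlie$status)
X <- as.matrix(sorlie[, 3:ncol(sorlie)])

set.seed(10101)
tuned <- tune.ahazpen(surv, X, dfmax = 30)

# Stage 1: the sequence of penalty parameters from the initial fit,
# with the number of non-zero coefficients at each value
head(cbind(lambda = tuned$lambda, df = tuned$df))

# Stage 2: the penalty parameter selected from that sequence
tuned$lambda.min
tuned$lambda.min %in% tuned$lambda
```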

Value

An object with S3 class "tune.ahazpen".

`call`	The call that produced this object.
`lambda`	The actual sequence of `lambda` values used.
`tunem`	The tuning score for each value of `lambda` (mean cross-validated error, if `tune=cv.control()`).
`tunesd`	Estimate of the cross-validated standard error, if `tune=cv.control()`.
`tunelo`	Lower curve = `tunem-tunesd`, if `tune=cv.control()`.
`tuneup`	Upper curve = `tunem+tunesd`, if `tune=cv.control()`.
`lambda.min`	Value of `lambda` for which `tunem` is minimized.
`df`	Number of non-zero coefficients at each value of `lambda`.
`tune`	The selected `tune` of S3 class `"ahaz.tune.control"`.
`penalty`	The selected `penalty` of S3 class `"ahazpen.pen.control"`.
`foldsused`	Folds actually used, if `tune=cv.control()`.
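
The relationship between these components can be sketched in base R. The list `res` below is a hypothetical stand-in for a `"tune.ahazpen"` object. Its component names follow the list above, but every number is made up for illustration and does not come from a real fit.

```r
# Hypothetical stand-in for a "tune.ahazpen" object (made-up numbers)
res <- list(
  lambda = c(0.50, 0.25, 0.10, 0.05, 0.01),
  tunem  = c(1.20, 0.95, 0.80, 0.85, 1.10),
  tunesd = c(0.10, 0.08, 0.07, 0.09, 0.15),
  df     = c(0L, 2L, 5L, 9L, 20L)
)

# Lower and upper curves, as described for tunelo and tuneup
res$tunelo <- res$tunem - res$tunesd
res$tuneup <- res$tunem + res$tunesd

# lambda.min is the value of lambda at which tunem is smallest
best <- which.min(res$tunem)
res$lambda.min <- res$lambda[best]

res$lambda.min  # 0.1
res$df[best]    # 5
```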

References

Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.

Examples

library(ahaz)      # tune.ahazpen() and the sorlie data
library(survival)  # Surv() and survfit()
data(sorlie)

# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2

# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])

# Training/test data
set.seed(20202)
train <- sample(1:nrow(sorlie),76)
test <- setdiff(1:nrow(sorlie),train)

# Run cross validation on training data
set.seed(10101)
cv.las <- tune.ahazpen(surv[train,], X[train,],dfmax=30)
plot(cv.las)

# Check fit on the test data
testrisk <- predict(cv.las,X[test,],type="lp")
plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")

# Advanced example, cross-validation of one-step SCAD
# with initial solution derived from univariate models.
# Since init.sol is specified as a function, it is
# automatically cross-validated as well
scadfun<-function(surv,X,weights){coef(ahaz(surv,X,univariate=TRUE))}
set.seed(10101)
cv.ssc<-tune.ahazpen(surv[train,],X[train,],
                     penalty=sscad.control(init.sol=scadfun),
                     tune=cv.control(rep=5),dfmax=30)
# Check fit on test data
testrisk <- predict(cv.ssc,X[test,],type="lp")
plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")

[Package ahaz version 1.15 Index]