ahazpen {ahaz} | R Documentation |
Fit penalized semiparametric additive hazards model
Description
Fit a semiparametric additive hazards model via penalized estimating equations using, for example, the lasso penalty. The complete regularization path is computed at a grid of values for the penalty parameter lambda via the method of cyclic coordinate descent.
Usage
ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(),
nlambda=100, dfmax=nvars, pmax=min(nvars, 2*dfmax),
lambda.minf=ifelse(nobs < nvars,0.05, 1e-4), lambda,
penalty.wgt=NULL, keep=NULL, control=list())
Arguments
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
standardize |
Logical flag for variable standardization, prior to
model fitting. Estimates are always returned on
the original scale. Default is |
penalty |
A description of the penalty function to be used for
model fitting. This can be a character string naming a penalty
function (currently |
nlambda |
The number of |
dfmax |
Limit the maximum number of variables in the
model. Unless a complete
regularization path is needed, it is highly
recommended to initially choose a relatively smaller value of
|
pmax |
Limit the maximum number of variables to ever be considered by the coordinate descent algorithm. |
lambda.minf |
Smallest value of |
lambda |
An optional user supplied sequence of penalty parameters. Typical usage
is to have the
program compute its own |
penalty.wgt |
A vector of nonnegative penalty weights for each
regression coefficient. This is a number that multiplies |
keep |
A vector of indices of variables which should always be included in
the model (no penalization). Equivalent to specifying a |
control |
A list of parameters for controlling the
model fitting algorithm. The list is passed to |
Details
Fits the sequence of models implied by the penalty function penalty and the sequence of penalty parameters lambda, using the efficient method of cyclic coordinate descent. For data sets with a very large number of covariates, it is recommended to calculate only partial paths by specifying a smallish value of dfmax.
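As an illustration of the cyclic coordinate descent idea mentioned above, the following base R sketch minimizes a generic lasso-penalized quadratic loss, 0.5 * b'Db - d'b + lambda * sum(|b|). It is not the implementation used by ahazpen: the helper names soft_threshold and cd_lasso, the form of the loss, and the toy inputs D and d are all assumptions made for this example.

```r
# Soft-thresholding operator: shrinks z towards zero by lambda.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Hypothetical helper (not part of ahaz): cyclic coordinate descent for
#   0.5 * t(beta) %*% D %*% beta - sum(d * beta) + lambda * sum(abs(beta))
# where D is assumed symmetric with a strictly positive diagonal.
cd_lasso <- function(D, d, lambda, beta = numeric(length(d)),
                     maxit = 1000, tol = 1e-8) {
  for (it in seq_len(maxit)) {
    beta_old <- beta
    for (j in seq_along(beta)) {
      # Linear term for coordinate j with all other coordinates held fixed
      r_j <- d[j] - sum(D[j, -j] * beta[-j])
      # Closed-form one-dimensional lasso update
      beta[j] <- soft_threshold(r_j, lambda) / D[j, j]
    }
    # Stop when a full cycle no longer changes the coefficients
    if (max(abs(beta - beta_old)) < tol) break
  }
  beta
}

# Toy input (made up for illustration): with D equal to the identity,
# the solution is simply the soft-thresholded d.
D <- diag(3)
d <- c(3, 0.5, -2)
beta_hat <- cd_lasso(D, d, lambda = 1)
```

The beta argument allows a warm start from the solution at a neighbouring lambda value, which is one reason path algorithms of this kind tend to work best on a reasonably dense grid.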
The sequence lambda is computed automatically by the algorithm but can also be set (semi)manually by specifying nlambda or lambda. The stability and efficiency of the algorithm are highly dependent on the grid of lambda values being reasonably dense, and lambda (and nlambda) should be specified accordingly. In particular, it is not recommended to specify a single or a few lambda values. Instead, a partial regularization path should be calculated and the functions predict.ahazpen or coef.ahazpen should be used to extract coefficient estimates at specific lambda values.
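To make the notion of a reasonably dense grid concrete, the sketch below builds a decreasing, log-spaced sequence of penalty parameters in base R. The helper lambda_grid is hypothetical and not part of ahaz; the log spacing and the reading of lambda_minf as a fraction of the largest lambda are assumptions, chosen to mirror the user-specified sequence in the Examples section.

```r
# Hypothetical helper (not part of ahaz): decreasing, log-spaced grid of
# nlambda penalty parameters running from lambda_max down to
# lambda_minf * lambda_max. Log spacing is an assumption of this sketch.
lambda_grid <- function(lambda_max, lambda_minf = 1e-4, nlambda = 100) {
  exp(seq(log(lambda_max), log(lambda_max * lambda_minf),
          length.out = nlambda))
}

# 100 values between 0.3 and 0.1, i.e. the smallest value is 1/3 of the largest
grid <- lambda_grid(lambda_max = 0.3, lambda_minf = 1 / 3, nlambda = 100)
```

A vector built this way could be supplied through the lambda argument, although the usual recommendation above is to let the program compute its own sequence.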
Value
An object with S3 class "ahazpen".
call |
The call that produced this object |
beta |
An |
lambda |
The sequence of actual |
df |
The number of nonzero coefficients for each value of
|
nobs |
Number of observations. |
nvars |
Number of covariates. |
surv |
A copy of the argument |
npasses |
Total number of passes by the fitting algorithm over the data, for all lambda values. |
penalty.wgt |
The actually used |
penalty |
An object of class |
dfmax |
A copy of |
penalty |
A copy of |
References
Gorst-Rasmussen, A. & Scheike, T. H. (2012). Coordinate Descent Methods for the Penalized Semiparametric Additive Hazards Model. Journal of Statistical Software; 47(9):1-17. https://www.jstatsoft.org/v47/i09/
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
Leng, C. & Ma, S. (2007). Path consistent model selection in additive risk model via Lasso. Statistics in Medicine; 26:3753-3770.
Martinussen, T. & Scheike, T. H. (2008). Covariate selection for the semiparametric additive risk model. Scandinavian Journal of Statistics; 36:602-619.
Zou, H. & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models, Annals of Statistics; 36:1509-1533.
See Also
print.ahazpen, predict.ahazpen, coef.ahazpen, plot.ahazpen, tune.ahazpen.
Examples
data(sorlie)
# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2
# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])
# Fit additive hazards regression model
fit1 <- ahazpen(surv, X,penalty="lasso", dfmax=30)
fit1
plot(fit1)
# Extend the grid to contain exactly 100 lambda values
lrange <- range(fit1$lambda)
fit2 <- ahazpen(surv, X,penalty="lasso", lambda.minf=lrange[1]/lrange[2])
plot(fit2)
# User-specified lambda sequence
lambda <- exp(seq(log(0.30), log(0.1), length = 100))
fit3 <- ahazpen(surv, X, penalty="lasso", lambda = lambda)
plot(fit3)
# Advanced usage - specify details of the penalty function
fit4 <- ahazpen(surv, X,penalty=sscad.control(nsteps=2))
fit4
fit5 <- ahazpen(surv, X,penalty=lasso.control(alpha=0.1))
plot(fit5)