ahazisis {ahaz} | R Documentation |
Independent screening for the semiparametric additive hazards model
Description
Fast and scalable model selection for the semiparametric additive hazards model via univariate screening combined with penalized regression.
Usage
ahazisis(surv, X, weights, standardize=TRUE,
nsis=floor(nobs/1.5/log(nobs)), do.isis=TRUE,
maxloop=5, penalty=sscad.control(), tune=cv.control(),
rank=c("FAST","coef","z","crit"))
Arguments
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
standardize |
Logical flag for variable standardization, prior to
model fitting. Estimates are always returned on
the original scale. Default is standardize=TRUE. |
nsis |
Number of covariates to recruit initially. If
|
.
do.isis |
Perform iterated independent screening? |
maxloop |
Maximal number of iterations of the algorithm if |
rank |
Method to use for (re)recruitment of variables. See details. |
penalty |
A description of the penalty function to be used for
the variable selection part. This can be a character string naming a penalty
function (currently |
tune |
A description of the tuning method to be used for the
variable selection part. This can be
a character string naming a tuning control
function (currently |
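As a rough illustration of the default for nsis given under Usage, the expression can be evaluated for a few sample sizes. The helper name default_nsis and the sample sizes below are made up for this sketch, and nobs is assumed to be the number of observations.

```r
# Hypothetical helper mirroring the default expression shown under Usage;
# it is not part of the ahaz package.
default_nsis <- function(nobs) floor(nobs / 1.5 / log(nobs))

# Illustrative sample sizes (assumed, not taken from any data set)
sizes <- c(100, 500, 1000)
data.frame(nobs = sizes, nsis = default_nsis(sizes))
```

With 100 observations, for instance, the default recruits 14 covariates in the initial step.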
Details
The function is a basic implementation of the iterated sure independent screening method described in Gorst-Rasmussen & Scheike (2011). Briefly, the algorithm does the following:
1. Recruits the nsis most relevant covariates by ranking them according to the univariate ranking method described by rank.
2. Selects, using ahazpen with penalty function described in penalty, a model among the top two thirds of the nsis most relevant covariates. Call the size of this model m.
3. Recruits 'nsis minus m' new covariates among the non-selected covariates by ranking their relevance according to the univariate ranking method described in rank, adjusted for the already selected variables (using an unpenalized semiparametric additive hazards model).
Steps 2-3 are iterated maxloop times, or until nsis covariates have been recruited, or until the set of selected covariates is stable between two iterations, whichever comes first.
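The control flow of the steps above can be sketched in base R. This is a simplified reading, not the package's implementation: rank_fun and select_fun are hypothetical stand-ins for the univariate ranking and the penalized selection, the "top two thirds" restriction in step 2 is omitted, and the stopping rule on nsis is interpreted as "the selected model has reached size nsis".

```r
# Control-flow sketch only. rank_fun(idx, adjust) returns a relevance score
# for each covariate index in idx; select_fun(idx) returns the subset of idx
# kept by the selection step. Both are assumptions, not ahaz functions.
isis_sketch <- function(p, nsis, maxloop, rank_fun, select_fun) {
  all_idx <- seq_len(p)
  # Step 1: recruit the nsis top-ranked covariates
  first_scores <- rank_fun(all_idx, adjust = integer(0))
  recruited <- head(order(first_scores, decreasing = TRUE), nsis)
  selected <- integer(0)
  for (iter in seq_len(maxloop)) {
    # Step 2: select a model among the recruited covariates
    new_selected <- sort(select_fun(recruited))
    stable <- identical(new_selected, selected)
    selected <- new_selected
    # Stop when the selection is stable or no room is left to recruit
    if (stable || length(selected) >= nsis) break
    # Step 3: recruit nsis - m new covariates among the non-selected ones
    candidates <- setdiff(all_idx, selected)
    scores <- rank_fun(candidates, adjust = selected)
    n_new <- nsis - length(selected)
    extra <- candidates[head(order(scores, decreasing = TRUE), n_new)]
    recruited <- c(selected, extra)
  }
  selected
}

# Toy stand-ins: fixed relevance scores and a simple threshold rule
relevance <- c(9, 1, 8, 2, 7, 3, 6, 4, 5, 0)
toy_rank <- function(idx, adjust) relevance[idx]
toy_select <- function(idx) idx[relevance[idx] > 6.5]
result <- isis_sketch(p = 10, nsis = 4, maxloop = 5, toy_rank, toy_select)
result
```

In this toy run the selected set is unchanged in the second pass, so the loop stops on the stability criterion well before maxloop is reached.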
The following choices of ranking method exist:
- rank="FAST" corresponds to ranking, in the initial recruitment step only, by the basic FAST-statistic described in Gorst-Rasmussen & Scheike (2011). If do.isis=TRUE then the algorithm sets rank="z" for subsequent rankings.
- rank="coef" corresponds to ranking by absolute value of (univariate) regression coefficients, obtained via ahaz.
- rank="z" corresponds to ranking by the |Z|-statistic of the (univariate) regression coefficients, obtained via ahaz.
- rank="crit" corresponds to ranking by the size of the decrease in the (univariate) natural loss function used for estimation by ahaz.
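The difference between rank="coef" and rank="z" can be illustrated with made-up numbers. The estimates and standard errors below are hypothetical, and the Z-statistic is assumed to be the usual estimate divided by its standard error; nothing here calls the ahaz package.

```r
# Hypothetical univariate estimates and standard errors for three covariates
beta <- c(x1 = 0.8, x2 = -1.5, x3 = 0.3)
se   <- c(x1 = 0.4, x2 = 1.5,  x3 = 0.05)

# Ranking by absolute coefficient (the idea behind rank="coef")
by_coef <- names(beta)[order(abs(beta), decreasing = TRUE)]

# Ranking by |Z| = |estimate / standard error| (the idea behind rank="z")
z <- beta / se
by_z <- names(z)[order(abs(z), decreasing = TRUE)]

by_coef
by_z
```

Here x2 has the largest coefficient but also the largest standard error, so it ranks first by coefficient and last by |Z|. The coefficient ranking depends on the scale of each covariate, whereas the |Z| ranking accounts for estimation uncertainty.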
Value
An object with S3 class "ahazisis".
call |
The call that produced this object. |
initRANKorder |
The initial ranking order. |
detail.pickind |
List (of length at most |
detail.ISISind |
List (of length at most |
detail.ISIScoef |
List (of length at most |
SISind |
Indices of covariates selected in the initial recruitment step. |
ISISind |
Indices of the final set of covariates selected by the iterated algorithm. |
ISIScoef |
Vector of the penalized regression coefficients of the
covariates in |
nsis |
The argument |
do.isis |
The argument |
maxloop |
The argument |
References
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
See Also
print.ahazisis, ahazpen, ahaz.adjust
Examples
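library(survival)  # provides Surv() and survfit()
library(ahaz)      # provides ahazisis() and the sorlie data
data(sorlie)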
# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2
# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])
# Basic ISIS/SIS with a single step
set.seed(10101)
m1 <- ahazisis(surv,X,maxloop=1,rank="coef")
m1
# Indices of the variables from the initial recruitment step
m1$SISind
# Indices of selected variables
m1$ISISind
# Check fit
score <- X[,m1$ISISind]%*%m1$ISIScoef
plot(survfit(surv~I(score>median(score))))