ahazisis {ahaz}    R Documentation

Independent screening for the semiparametric additive hazards model

Description

Fast and scalable model selection for the semiparametric additive hazards model via univariate screening combined with penalized regression.

Usage

ahazisis(surv, X, weights, standardize=TRUE,
        nsis=floor(nobs/1.5/log(nobs)), do.isis=TRUE,
        maxloop=5, penalty=sscad.control(), tune=cv.control(),
        rank=c("FAST","coef","z","crit"))

Arguments

surv

Response in the form of a survival object, as returned by the function Surv() in the package survival. Right-censored data and counting process format (left-truncation) are supported. Tied survival times are not supported.

X

Design matrix. Missing values are not supported.

weights

Optional vector of observation weights. Default is 1 for each observation.

standardize

Logical flag for variable standardization, prior to model fitting. Estimates are always returned on the original scale. Default is standardize=TRUE.

nsis

Number of covariates to recruit initially. If do.isis=TRUE, then this is also the maximal number of variables that the algorithm will recruit. Default is nsis=floor(nobs/1.5/log(nobs)).

do.isis

Logical flag: perform iterated independent screening? Default is do.isis=TRUE.

maxloop

Maximal number of iterations of the algorithm if do.isis=TRUE. Default is maxloop=5.

rank

Method to use for (re)recruitment of variables. See details.

penalty

A description of the penalty function to be used for the variable selection part. This can be a character string naming a penalty function (currently "lasso" or stepwise SCAD, "sscad") or a call to the penalty function. Default is penalty=sscad.control(). See ahazpen and ahaz.pen.control for more options and examples.

tune

A description of the tuning method to be used for the variable selection part. This can be a character string naming a tuning control function (currently "cv" or "bic") or a call to the tuning control function. Default is tune=cv.control(). See ahaz.tune.control for options and examples.
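As a rough illustration of the default for nsis, the following base R sketch evaluates the expression from Usage for a hypothetical sample size. The value nobs <- 200 is an assumption chosen only for this example.

```r
# Hypothetical sample size, chosen only for illustration
nobs <- 200

# Default number of covariates to recruit, as written in Usage
nsis_default <- floor(nobs / 1.5 / log(nobs))
nsis_default
```

The default grows slightly slower than linearly in the number of observations, so the initial screening step stays well below the sample size.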

Details

The function is a basic implementation of the iterated sure independent screening method described in Gorst-Rasmussen & Scheike (2011). Briefly, the algorithm does the following:

  1. Recruits the nsis most relevant covariates by ranking them according to the univariate ranking method described by rank.

  2. Selects, using ahazpen with penalty function described in penalty, a model among the top two thirds of the nsis most relevant covariates. Call the size of this model m.

  3. Recruits nsis - m new covariates among the non-selected covariates by ranking their relevance according to the univariate ranking method described in rank, adjusted for the already selected variables (using an unpenalized semiparametric additive hazards model).

Steps 2-3 are iterated maxloop times, or until nsis covariates have been recruited, or until the set of selected covariates is stable between two iterations, whichever comes first.
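The control flow of steps 1-3 can be sketched in base R as follows. This is an illustration only, not the package's implementation: isis_sketch, rank_fun and select_fun are hypothetical names, the rounding of "two thirds" with floor() and the reading of the nsis stopping rule as m >= nsis are assumptions, and the toy scores stand in for the real ranking statistics and the ahazpen fit.

```r
# Control-flow sketch only. rank_fun() and select_fun() are hypothetical
# stand-ins, NOT the ranking statistics or the ahazpen fit used by ahazisis.
isis_sketch <- function(p, nsis, maxloop, rank_fun, select_fun) {
  # Step 1: recruit the nsis top-ranked covariates (unadjusted ranking)
  ord <- order(rank_fun(seq_len(p), adjust = integer(0)), decreasing = TRUE)
  recruited <- ord[seq_len(min(nsis, p))]
  # First selection uses the top two thirds of the recruits (floor() assumed)
  candidates <- recruited[seq_len(floor(2 * length(recruited) / 3))]
  selected <- integer(0)
  for (iter in seq_len(maxloop)) {
    previous <- selected
    # Step 2: variable selection among the current candidates
    selected <- select_fun(candidates)
    m <- length(selected)
    if (setequal(selected, previous)) break  # selected set is stable
    if (m >= nsis) break                     # assumed reading of the nsis rule
    # Step 3: recruit nsis - m new covariates, ranked adjusting for 'selected'
    rest <- setdiff(seq_len(p), selected)
    score <- rank_fun(rest, adjust = selected)
    new_idx <- rest[order(score, decreasing = TRUE)]
    new_idx <- new_idx[seq_len(min(nsis - m, length(rest)))]
    candidates <- c(selected, new_idx)
  }
  selected
}

# Toy inputs: fixed relevance scores and a threshold rule as the "selector"
toy_score <- c(9, 8, 7, 6, 5, 4, 3, 2, 1, 0.5)
rank_fun <- function(idx, adjust) toy_score[idx]   # ignores 'adjust'
select_fun <- function(cand) cand[toy_score[cand] > 6.5]

res <- isis_sketch(p = 10, nsis = 6, maxloop = 5, rank_fun, select_fun)
res
```

With these toy inputs the loop stops in the second iteration because the selected set no longer changes, which is the stability criterion described above.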

The following choices of ranking method exist: "FAST", "coef", "z" and "crit".

Value

An object with S3 class "ahazisis".

call

The call that produced this object.

initRANKorder

The initial ranking order.

detail.pickind

List (of length at most maxloop) listing the covariates selected in each recruitment step.

detail.ISISind

List (of length at most maxloop) listing the covariates selected in each variable selection step.

detail.ISIScoef

List (of length at most maxloop) listing the estimated penalized regression coefficients corresponding to the indices in detail.ISISind.

SISind

Indices of covariates selected in the initial recruitment step.

ISISind

Indices of the final set of covariates selected by the iterated algorithm.

ISIScoef

Vector of the penalized regression coefficients of the covariates in ISISind.

nsis

The argument nsis.

do.isis

The argument do.isis.

maxloop

The argument maxloop.

References

Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.

See Also

print.ahazisis, ahazpen, ahaz.adjust

Examples

# Load the packages providing ahazisis(), Surv() and survfit()
library(survival)
library(ahaz)
data(sorlie)

# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2

# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])

# Basic ISIS/SIS with a single step
set.seed(10101)
m1 <- ahazisis(surv,X,maxloop=1,rank="coef")
m1
# Indices of the variables from the initial recruitment step
m1$SISind
# Indices of selected variables
m1$ISISind
# Check fit
score <- X[,m1$ISISind]%*%m1$ISIScoef
plot(survfit(surv~I(score>median(score))))
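A further sketch, using only the arguments and return components documented above, runs two iterations with the character-string forms of penalty and tune and then inspects the per-iteration details. The object name m2 and the particular choice of "lasso" with "bic" are assumptions made for illustration, and the output depends on the data and package version.

```r
library(survival)
library(ahaz)
data(sorlie)

# Break ties, as in the example above
set.seed(10101)
time <- sorlie$time + runif(nrow(sorlie)) * 1e-2
surv <- Surv(time, sorlie$status)
X <- as.matrix(sorlie[, 3:ncol(sorlie)])

# Iterated screening with lasso penalty and BIC tuning, at most two iterations
m2 <- ahazisis(surv, X, maxloop = 2, penalty = "lasso", tune = "bic")

# Number of covariates recruited initially, and the nsis value used
length(m2$SISind)
m2$nsis

# Covariates selected in each variable selection step
m2$detail.ISISind
```

Comparing the entries of detail.ISISind across iterations shows whether the selected set had stabilized before maxloop was reached.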


[Package ahaz version 1.14 Index]