ahazisis {ahaz}

R Documentation

Independent screening for the semiparametric additive hazards model

Description

Fast and scalable model selection for the semiparametric additive hazards model via univariate screening combined with penalized regression.

Usage

ahazisis(surv, X, weights, standardize=TRUE,
        nsis=floor(nobs/1.5/log(nobs)), do.isis=TRUE,
        maxloop=5, penalty=sscad.control(), tune=cv.control(),
        rank=c("FAST","coef","z","crit"))

Arguments

`surv`	Response in the form of a survival object, as returned by the function `Surv()` in the package survival. Right-censored and counting process (left-truncation) formats are supported. Tied survival times are not supported.
`X`	Design matrix. Missing values are not supported.
`weights`	Optional vector of observation weights. Default is 1 for each observation.
`standardize`	Logical flag for variable standardization, prior to model fitting. Estimates are always returned on the original scale. Default is `standardize=TRUE`.
`nsis`	Number of covariates to recruit initially. If `do.isis=TRUE`, then this is also the maximal number of variables that the algorithm will recruit. Default is `nsis=floor(nobs/1.5/log(nobs))`.
`do.isis`	Perform iterated independent screening? Default is `do.isis=TRUE`.
`maxloop`	Maximal number of iterations of the algorithm if `do.isis=TRUE`. Default is `maxloop=5`.
`rank`	Method to use for (re)recruitment of variables. See Details.
`penalty`	A description of the penalty function to be used for the variable selection part. This can be a character string naming a penalty function (currently `"lasso"` or stepwise SCAD, `"sscad"`) or a call to the penalty function. Default is `penalty=sscad.control()`. See `ahazpen` and `ahazpen.pen.control` for more options and examples.
`tune`	A description of the tuning method to be used for the variable selection part. This can be a character string naming a tuning control function (currently `"cv"` or `"bic"`) or a call to the tuning control function. Default is `tune=cv.control()`. See `ahaz.tune.control` for options and examples.
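
To get a feel for the default value of `nsis`, the formula from the Usage section can be evaluated directly. The sketch below assumes that `nobs` denotes the number of observations (rows of `X`), which this page does not state explicitly; `default_nsis` is a hypothetical helper name and not part of ahaz.

```r
# Hypothetical helper (not part of ahaz): evaluates the documented default
# nsis = floor(nobs/1.5/log(nobs)), assuming nobs = number of observations.
default_nsis <- function(nobs) floor(nobs / 1.5 / log(nobs))

default_nsis(100)   # 14
default_nsis(1000)  # 96
```

Under that assumption, the default recruits far fewer covariates than there are observations, which keeps the subsequent penalized fit small.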

Details

The function is a basic implementation of the iterated sure independent screening method described in Gorst-Rasmussen & Scheike (2011). Briefly, the algorithm does the following:

1. Recruits the `nsis` most relevant covariates by ranking them according to the univariate ranking method described by `rank`.
2. Selects, using `ahazpen` with the penalty function described in `penalty`, a model among the top two thirds of the `nsis` most relevant covariates. Call the size of this model `m`.
3. Recruits `nsis - m` new covariates among the non-selected covariates by ranking their relevance according to the univariate ranking method described in `rank`, adjusted for the already selected variables (using an unpenalized semiparametric additive hazards model).

Steps 2-3 are iterated `maxloop` times, or until `nsis` covariates have been recruited, or until the set of selected covariates is stable between two iterations, whichever comes first.
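
The control flow of the steps above can be illustrated with a toy sketch in base R. This is not the ahaz implementation: absolute correlation with a continuous response stands in for the univariate ranking, a least-squares t-statistic cut-off stands in for penalized selection with `ahazpen`, and the "top two thirds" restriction is omitted. The stopping rule on `m` is one reading of "until `nsis` covariates have been recruited". All names (`isis_sketch`, `rank_vars`, `select_vars`) are hypothetical.

```r
# Toy illustration of the recruit / select / re-recruit loop. NOT the ahaz code:
# correlation replaces the ranking in `rank`, and a t-test cut-off in a linear
# model replaces penalized selection via ahazpen.
isis_sketch <- function(y, X, nsis, maxloop = 5) {
  # Stand-in ranking: order candidate columns by |correlation| with resp.
  rank_vars <- function(resp, cand) {
    score <- abs(cor(X[, cand, drop = FALSE], resp))[, 1]
    cand[order(score, decreasing = TRUE)]
  }
  # Stand-in selection: keep columns whose |t| exceeds 2 in a joint fit.
  select_vars <- function(idx) {
    tab <- coef(summary(lm(y ~ X[, idx, drop = FALSE])))
    idx[abs(tab[-1, "t value"]) > 2]
  }

  # Step 1: recruit the nsis top-ranked covariates.
  recruited <- rank_vars(y, seq_len(ncol(X)))[seq_len(nsis)]
  selected <- integer(0)
  for (iter in seq_len(maxloop)) {
    # Step 2: select a model of size m among the recruited covariates.
    new_selected <- select_vars(recruited)
    stable <- setequal(new_selected, selected)
    selected <- new_selected
    m <- length(selected)
    if (stable || m >= nsis) break
    # Step 3: recruit nsis - m new covariates, ranked after adjusting
    # for the selected ones (here: correlation with lm residuals).
    resp <- if (m > 0) resid(lm(y ~ X[, selected, drop = FALSE])) else y
    rest <- setdiff(seq_len(ncol(X)), selected)
    recruited <- c(selected, rank_vars(resp, rest)[seq_len(nsis - m)])
  }
  sort(selected)
}

# Simulated data: only columns 1 and 2 carry signal.
set.seed(1)
X <- matrix(rnorm(100 * 50), nrow = 100)
y <- 3 * X[, 1] - 3 * X[, 2] + rnorm(100)
sel <- isis_sketch(y, X, nsis = 10)
sel
```

The point of the sketch is the loop structure: selection shrinks the recruited set to `m` variables, and re-recruitment tops it back up to `nsis` using rankings that account for what is already in the model.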

The following choices of ranking method exist:

`rank="FAST"` corresponds to ranking, in the initial recruitment step only, by the basic FAST statistic described in Gorst-Rasmussen & Scheike (2011). If `do.isis=TRUE`, the algorithm then sets `rank="z"` for subsequent rankings.
`rank="coef"` corresponds to ranking by the absolute value of the (univariate) regression coefficients, obtained via `ahaz`.
`rank="z"` corresponds to ranking by the |Z|-statistic of the (univariate) regression coefficients, obtained via `ahaz`.
`rank="crit"` corresponds to ranking by the size of the decrease in the (univariate) natural loss function used for estimation by `ahaz`.
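
The difference between coefficient-based and Z-based ranking can be seen on made-up numbers. The sketch below uses hypothetical estimates and standard errors (not output from `ahaz`) and assumes the Z-statistic is the usual ratio of estimate to standard error.

```r
# Hypothetical univariate estimates and standard errors for three covariates
# (made-up numbers, not output from ahaz).
est <- c(x1 = 0.5, x2 = -2.0, x3 = 1.0)
se  <- c(x1 = 0.1, x2 = 1.0, x3 = 0.25)

# Analogue of rank="coef": order by absolute coefficient.
rank_coef <- names(est)[order(abs(est), decreasing = TRUE)]
# Analogue of rank="z": order by absolute Z-statistic (estimate / std. error).
rank_z <- names(est)[order(abs(est / se), decreasing = TRUE)]

rank_coef  # "x2" "x3" "x1"
rank_z     # "x1" "x3" "x2"
```

A covariate with a large but imprecisely estimated coefficient (`x2` here) ranks first by coefficient size but last by |Z|, so the two methods can recruit different variables.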

Value

An object with S3 class "ahazisis".

`call`	The call that produced this object.
`initRANKorder`	The initial ranking order.
`detail.pickind`	List (of length at most `maxloop`) listing the covariates selected in each recruitment step.
`detail.ISISind`	List (of length at most `maxloop`) listing the covariates selected in each variable selection step.
`detail.ISIScoef`	List (of length at most `maxloop`) listing the estimated penalized regression coefficients corresponding to the indices in `detail.ISISind`.
`SISind`	Indices of covariates selected in the initial recruitment step.
`ISISind`	Indices of the final set of covariates selected by the iterated algorithm.
`ISIScoef`	Vector of the penalized regression coefficients of the covariates in `ISISind`.
`nsis`	The argument `nsis`.
`do.isis`	The argument `do.isis`.
`maxloop`	The argument `maxloop`.

References

Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.

Examples

library(survival)  # Surv(), survfit()
library(ahaz)      # ahazisis(), sorlie data

data(sorlie)

# Break ties
set.seed(10101)
time <- sorlie$time+runif(nrow(sorlie))*1e-2

# Survival data + covariates
surv <- Surv(time,sorlie$status)
X <- as.matrix(sorlie[,3:ncol(sorlie)])

# Basic ISIS/SIS with a single step
set.seed(10101)
m1 <- ahazisis(surv,X,maxloop=1,rank="coef")
m1
# Indices of the variables from the initial recruitment step
m1$SISind
# Indices of selected variables
m1$ISISind
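# Check fit: compare survival between subjects with a high and a low
# linear predictor (drop=FALSE keeps a matrix if only one variable is selected)
score <- X[,m1$ISISind,drop=FALSE]%*%m1$ISIScoef
plot(survfit(surv~I(score>median(score))))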

[Package ahaz version 1.15 Index]