R: CoxICPen with two sets of covariates

CoxICPen.XZ {CoxICPen}

R Documentation

CoxICPen with two sets of covariates

Description

Perform variable selection for Cox regression model with two sets of covariates by using the method in Wu et al. (2020). Variable selection is performed on the possibly high-dimensional covariates x with linear effects. Covariates z with possibly nonlinear effects are always kept in the model.

Usage

CoxICPen.XZ(LR = LR,
            x = x,
            z = z,
            lamb = log(nrow(x))/2-2,
            beta.initial = rep(0,ncol(x)),
            pen = "BAR",
            nfold = 5,
            BernD = 3,
            subj.wt = rep(1,nrow(x)))

Arguments

`LR`	An n by 2 matrix that contains interval-censored failure times (L, R]. Please set time point R to "Inf" if a subject is right-censored.
`x`	An n by p covariate matrix. Variable selection will be performed on x. Linear covariates effects are assumed. Both p>n and p<n are allowed.
`z`	An n by q covariate matrix. Variable selection will NOT be performed on z. Non-linear covariates effects are assumed. Only q<n is allowed.
`lamb`	The value of the tuning parameter of the penalty term. Can either be a single value or a vector. Cross-validation will be employed to select the optimal lambda if a vector is provided. Default is log(n)/2-2.
`beta.initial`	The initial values for the regression coefficients in the Cox's model. Default is 0.
`pen`	The penalty function. Choices include "RIDGE", "BAR", "LASSO", "ALASSO", "SCAD", "MCP", "SICA", "SELO". Default is "BAR".
`nfold`	Number of folds for cross-validation. Will be ignored if a single lambda value is provided. Default is 5.
`BernD`	The degree of Bernstein polynomials for both cumulative baseline hazard and covariate effects of z. Default is 3.
`subj.wt`	Weight for each subject in the likelihood function. Can be used to incorporate case-cohort design. Default is 1 for each subject.

Value

beta: Penalized estimates of the regression coefficients in the Cox's model.

phi: Estimates of the coefficients in Bernstein Polynomials.

logL: Log likelihood function based on current parameter estimates and lambda value.

Lamb0: Estimate of the cumulative baseline hazard function at each observation time point.

cv.out: Cross-validation outcome for each lambda. Will be NULL if cross-validation is not performed.

f.est.all: A matrix that contains the values of covariates z and the corresponding estimated effects.

References

Wu, Q., Zhao, H., Zhu, L., Sun, J. (2020). Variable Selection for High-dimensional Partly Linear Additive Cox Model with Application to Alzheimer's disease. Statistics in Medicines.39(23):3120-3134.

Examples


# Generate an example data

require(foreach)

n <- 300  # Sample size
p <- 20   # Number of covariates

bet0 <- c(1, -1, 1, -1, rep(0,p-4))  # True values of regression coefficients
f1 <- function(z) sin(2*pi*z)  # True effects of z1
f2 <- function(z) cos(2*pi*z)  # True effects of z2
set.seed(1)
x.example <- matrix(rnorm(n*p,0,1),n,p)  # Generate x covariates matrix
z.example <- cbind(runif(n,0,1),runif(n,0,1))  # Generate z covariates matrix

T.example <- c()
for (i in 1:n){
  T.example[i] <- rexp(1,exp(x.example%*%bet0+
    f1(z.example[,1])+f2(z.example[,2]))[i])  # Generate true failure times
}

timep <- seq(0,3,,10)
LR.example <- c()
for (i in 1:n){
  obsT <- timep*rbinom(10,1,0.5)
  if (max(obsT) < T.example[i]) {LR.example <- rbind(LR.example,c(max(obsT), Inf))} else {
    LR.example <- rbind(LR.example,c(max(obsT[obsT<T.example[i]]), min(obsT[obsT>=T.example[i]])))
  }
}  # Generate interval-censored failure times


# Fit Cox's model with penalized estimation

model1 <- CoxICPen.XZ(LR = LR.example, x = x.example, z = z.example, lamb = 100, pen = "RIDGE")
beta.initial <- model1$beta

model2 <- CoxICPen.XZ(LR = LR.example, x = x.example, z = z.example, 
                      beta.initial = beta.initial, pen = "BAR")
model2$beta

# Plots of covariate effects of z

par(mfrow=c(1,2))

plot(model2$f.est.all$z1, model2$f.est.all$f1, type="l", ylim=c(-1,2),
     xlab="z1", ylab="f1")
lines(model2$f.est.all$z1, f1(model2$f.est.all$z1), col="blue")
legend("topright", col=c("black","blue"), lty=rep(1,2), c("Estimate", "True"))

plot(model2$f.est.all$z2, model2$f.est.all$f2, type="l", ylim=c(-1,2),
     xlab="z2", ylab="f2")
lines(model2$f.est.all$z2, f2(model2$f.est.all$z2), col="blue")
legend("topright", col=c("black","blue"), lty=rep(1,2), c("Estimate", "True"))