sox {sox} | R Documentation |
(Time-dependent) Cox model with structured variable selection
Description
Fit a (time-dependent) Cox model with an overlapping (including nested) group lasso penalty. The regularization path is computed over a grid of values of the regularization parameter lambda.
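As a minimal sketch of such a grid, the snippet below builds the same log-spaced sequence that the Examples section uses and checks that it is evenly spaced on the log scale. The name `ratios` is introduced here for illustration only; this page does not state whether `sox` requires any particular ordering of `lambda`.

```r
# Log-spaced grid of 20 regularization values, running from 1 down to 1e-3
# (the same construction as lam.seq in the Examples).
lam.seq <- exp(seq(log(1e0), log(1e-3), length.out = 20))

# Consecutive values share a constant ratio, i.e. the grid is uniform
# on the log scale. `ratios` is an illustrative helper name.
ratios <- lam.seq[-1] / lam.seq[-length(lam.seq)]

range(lam.seq)
all(diff(lam.seq) < 0)  # the grid is strictly decreasing
```

A log-spaced grid puts more points near small penalties, where the set of selected variables tends to change most quickly.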
Usage
sox(
  x,
  ID,
  time,
  time2,
  event,
  penalty,
  lambda,
  group,
  group_variable,
  own_variable,
  no_own_variable,
  penalty_weights,
  par_init,
  stepsize_init = 1,
  stepsize_shrink = 0.8,
  tol = 1e-05,
  maxit = 1000L,
  verbose = FALSE
)
Arguments
x |
Predictor matrix with dimension |
ID |
The ID of each subject; each subject has one ID (multiple rows in |
time |
Represents the start of each time interval. |
time2 |
Represents the stop of each time interval. |
event |
Indicator of event. |
penalty |
Character string, indicating whether " |
lambda |
Sequence of regularization coefficients |
group |
A |
group_variable |
A |
own_variable |
A non-decreasing integer vector of length |
no_own_variable |
An integer vector of length |
penalty_weights |
Optional, vector of length |
par_init |
Optional, vector of initial values of the optimization algorithm. Default initial value is zero for all |
stepsize_init |
Initial value of the stepsize of the optimization algorithm. Default is 1.0. |
stepsize_shrink |
Factor in |
tol |
Convergence criterion. Algorithm stops when the |
maxit |
Maximum number of iterations allowed. |
verbose |
Logical, whether progress is printed. |
Details
The predictor matrix should be of dimension nm * p. Each row records the values of the covariates for one subject at one time, for example, the values on the day from time (Start) to time2 (Stop). An example dataset sim is provided. The dataset has the format produced by the R package PermAlgo.
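To make the row layout concrete, here is a small sketch of data in this start/stop format. The data frame `toy` and all of its values are hypothetical and are not taken from the packaged `sim` dataset; only the column names `Id`, `Start`, `Stop` and `Event` follow the names used in the Examples.

```r
# Hypothetical toy data: two subjects, one row per subject per time interval
# running from Start to Stop.
toy <- data.frame(
  Id    = c(1, 1, 1, 2, 2),
  Start = c(0, 1, 2, 0, 1),
  Stop  = c(1, 2, 3, 1, 2),
  Event = c(0, 0, 1, 0, 0),   # subject 1 has the event in its last interval
  A1    = c(0, 1, 1, 0, 0),   # a covariate that changes over time
  B     = c(1, 1, 1, 0, 0)    # a covariate that stays fixed within subject
)

# Predictor matrix: one column per covariate, one row per subject-interval.
x.toy <- as.matrix(toy[, c("A1", "B")])
dim(x.toy)

# Sanity check: within each subject, every interval starts where the
# previous one stopped.
contiguous <- tapply(seq_len(nrow(toy)), toy$Id, function(i) {
  all(toy$Start[i][-1] == toy$Stop[i][-length(i)])
})
all(contiguous)
```

The vectors passed as `ID`, `time`, `time2` and `event` would then be the corresponding columns of the same data frame, so that they line up row by row with the predictor matrix.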
The specification of the arguments group, group_variable, own_variable and no_own_variable for the grouping structure can be found in https://thoth.inrialpes.fr/people/mairal/spams/doc-R/html/doc_spams006.html#sec26 and https://thoth.inrialpes.fr/people/mairal/spams/doc-R/html/doc_spams006.html#sec27.
In the Examples below, p = 9 and G = 5, and the group structure is:
g_1 = \{A_{1}, A_{2}, A_{1}B, A_{2}B\},
g_2 = \{B, A_{1}B, A_{2}B, C_{1}B, C_{2}B\},
g_3 = \{A_{1}B, A_{2}B\},
g_4 = \{C_{1}, C_{2}, C_{1}B, C_{2}B\},
g_5 = \{C_{1}B, C_{2}B\},
where g_3 is a subset of g_1 and g_2, and g_5 is a subset of g_2 and g_4.
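The group structure above can be written down and checked in base R. The sketch below maps the variable names to column positions, using the column order from the Examples, and verifies the stated subset relations. The names `vars`, `groups`, `group.idx` and `is_subset` are illustrative and are not part of the package.

```r
# Column order of the predictor matrix used in the Examples.
vars <- c("A1", "A2", "C1", "C2", "B", "A1B", "A2B", "C1B", "C2B")

# The five groups, written with variable names.
groups <- list(
  g1 = c("A1", "A2", "A1B", "A2B"),
  g2 = c("B", "A1B", "A2B", "C1B", "C2B"),
  g3 = c("A1B", "A2B"),
  g4 = c("C1", "C2", "C1B", "C2B"),
  g5 = c("C1B", "C2B")
)

# Translate names into column indices of the predictor matrix.
group.idx <- lapply(groups, match, table = vars)

# Illustrative helper: is every member of group `a` also in group `b`?
is_subset <- function(a, b) all(a %in% b)

is_subset(groups$g3, groups$g1) && is_subset(groups$g3, groups$g2)
is_subset(groups$g5, groups$g2) && is_subset(groups$g5, groups$g4)
```

The index vectors in `group.idx` match the integer vectors passed to `overlap_structure()` in the Examples.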
Value
A list with the following three elements.
lambdas |
The user-specified regularization coefficients |
estimates |
A matrix, with each column corresponding to the coefficient estimates at each |
iterations |
A vector of the number of iterations taken to converge at each |
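One way to inspect a result of this shape is sketched below. The object `fit.mock` is a hypothetical stand-in with the three element names listed above; every number in it is made up, and the sketch assumes that the columns of `estimates` are ordered like `lambdas`.

```r
# Hypothetical stand-in for a fitted object: 3 lambdas, 4 coefficients.
# All numbers are invented for illustration.
fit.mock <- list(
  lambdas    = c(1, 0.1, 0.01),
  estimates  = cbind(c(0,   0,   0,  0),
                     c(0.4, 0,   0, -0.2),
                     c(0.6, 0.1, 0, -0.3)),
  iterations = c(3L, 25L, 60L)
)

# Number of nonzero coefficients in each column of the estimates matrix.
n.nonzero <- colSums(fit.mock$estimates != 0)

# One row per lambda: penalty value, model size, iterations used.
path <- data.frame(lambda     = fit.mock$lambdas,
                   nonzero    = n.nonzero,
                   iterations = fit.mock$iterations)
path
```

With a real fit, the same summary applied to the returned list would show how many variables enter the model along the path.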
Examples
x <- as.matrix(sim[, c("A1","A2","C1","C2","B","A1B","A2B","C1B","C2B")])
lam.seq <- exp(seq(log(1e0), log(1e-3), length.out = 20))
# Variables:
## 1: A1
## 2: A2
## 3: C1
## 4: C2
## 5: B
## 6: A1B
## 7: A2B
## 8: C1B
## 9: C2B
# Overlapping groups:
## g1: A1, A2, A1B, A2B
## g2: B, A1B, A2B, C1B, C2B
## g3: A1B, A2B
## g4: C1, C2, C1B, C2B
## g5: C1B, C2B
overlapping.groups <- list(c(1, 2, 6, 7),
                           c(5, 6, 7, 8, 9),
                           c(6, 7),
                           c(3, 4, 8, 9),
                           c(8, 9))
pars.overlapping <- overlap_structure(overlapping.groups)
fit.overlapping <- sox(
  x = x,
  ID = sim$Id,
  time = sim$Start,
  time2 = sim$Stop,
  event = sim$Event,
  penalty = "overlapping",
  lambda = lam.seq,
  group = pars.overlapping$groups,
  group_variable = pars.overlapping$groups_var,
  penalty_weights = pars.overlapping$group_weights,
  tol = 1e-4,
  maxit = 1e3,
  verbose = FALSE
)
str(fit.overlapping)
# Nested groups (misspecified; for demonstration of the software only)
## g1: A1, A2, C1, C2, B, A1B, A2B, C1B, C2B
## g2: A1, A2, A1B, A2B
## g3: C1, C2, C1B, C2B
## g4: 1
## g5: 2
## ...
## g12: 9
nested.groups <- list(1:9,
                      c(1, 2, 6, 7),
                      c(3, 4, 8, 9),
                      1, 2, 3, 4, 5, 6, 7, 8, 9)
pars.nested <- nested_structure(nested.groups)
fit.nested <- sox(
  x = x,
  ID = sim$Id,
  time = sim$Start,
  time2 = sim$Stop,
  event = sim$Event,
  penalty = "nested",
  lambda = lam.seq,
  group = pars.nested$groups,
  own_variable = pars.nested$own_variables,
  no_own_variable = pars.nested$N_own_variables,
  penalty_weights = pars.nested$group_weights,
  tol = 1e-4,
  maxit = 1e3,
  verbose = FALSE
)
str(fit.nested)