cforward {cforward} | R Documentation |
Forward Selection Based on C-Index/Concordance
Description
Performs forward selection of variables for a Cox model, based on the C-index (concordance), with optional cross-validation.
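To make the idea concrete, here is a minimal sketch of greedy forward selection on concordance. It is a simplified illustration, not the package's algorithm: it uses in-sample concordance with no cross-validation and no weights. The helper name forward_by_concordance and the use of the lung data from the survival package are assumptions made for this example.

```r
library(survival)

# Hypothetical helper: at each step, add the candidate whose Cox model has the
# highest in-sample concordance, until no candidates remain.
forward_by_concordance <- function(data, time, status, candidates) {
  selected <- character(0)
  path <- numeric(0)
  while (length(candidates) > 0) {
    # Concordance of the current model plus each remaining candidate
    conc <- sapply(candidates, function(v) {
      rhs <- paste(c(selected, v), collapse = " + ")
      f <- as.formula(paste0("Surv(", time, ", ", status, ") ~ ", rhs))
      concordance(coxph(f, data = data))$concordance
    })
    pick <- names(which.max(conc))
    selected <- c(selected, pick)
    path <- c(path, conc[pick])
    candidates <- setdiff(candidates, pick)
  }
  # Named vector: concordance after each variable was added, in selection order
  path
}

# Complete cases only, so every model is fit on the same rows
lung_cc <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])
path <- forward_by_concordance(lung_cc, "time", "status",
                               c("age", "sex", "ph.ecog"))
path
```

The names of the returned vector give the order in which variables entered the model, which is the same shape of information the Examples section extracts with best_concordance.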
Usage
cforward(
  data,
  event_time = "event_time_years",
  event_status = "mortstat",
  weight_column = "WTMEC4YR_norm",
  variables = NULL,
  included_variables = NULL,
  n_folds = 10,
  seed = 1989,
  max_model_size = 50,
  c_threshold = NULL,
  verbose = TRUE,
  cfit_args = list(),
  save_memory = FALSE,
  ...
)

cforward_one(
  data,
  event_time = "event_time_years",
  event_status = "mortstat",
  weight_column = "WTMEC4YR_norm",
  variables,
  included_variables = NULL,
  verbose = TRUE,
  cfit_args = list(),
  save_memory = FALSE,
  ...
)

make_folds(data, event_status = "mortstat", n_folds = 10, verbose = TRUE)
Arguments
data |
A data set on which to perform model selection and cross-validation. |
event_time |
Character vector of length 1 with event times, passed to
|
event_status |
Character vector of length 1 with event status, passed to
|
weight_column |
Character vector of length 1 with weights for
model. If no weights are available, set to |
variables |
Character vector of candidate variables for selection.
Must be in data. |
included_variables |
Character vector of variables
forced into the model. Must be in data. |
n_folds |
Number of folds for cross-validation. Set to 1 to run on the full data. |
seed |
Seed set before folds are created. |
max_model_size |
Maximum number of variables in the model. Selection stops when this size is reached. Note that this is not the number of coefficients, because a categorical variable can contribute several coefficients. |
c_threshold |
Threshold for the improvement in concordance. If the best concordance at a step does not exceed the previous best by at least this threshold, selection stops. |
verbose |
Print diagnostic messages. |
cfit_args |
Arguments passed to |
save_memory |
Save only a minimal amount of information and discard the fitted models. |
... |
Additional arguments to pass to |
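The n_folds, seed, and event_status arguments together control how rows are split for cross-validation. The sketch below shows one common way to do this, shuffling fold labels separately within each event status so that every fold gets a similar share of events. It is an assumption that the folds are balanced this way; assign_folds is a hypothetical helper written for this example, not the package's make_folds, and the status vector is made up.

```r
# Hypothetical helper, not the package's make_folds(): give each row a fold
# label in 1..n_folds, shuffled separately within each status value.
assign_folds <- function(status, n_folds = 10, seed = 1989) {
  set.seed(seed)  # fixing the seed makes the fold assignment reproducible
  folds <- integer(length(status))
  for (s in unique(status)) {
    idx <- which(status == s)
    # rep_len() cycles through 1..n_folds; indexing by sample.int() shuffles it
    labels <- rep_len(seq_len(n_folds), length(idx))
    folds[idx] <- labels[sample.int(length(idx))]
  }
  folds
}

status <- rep(c(0, 1), times = c(80, 20))  # made-up event indicator
folds <- assign_folds(status, n_folds = 5)
table(folds, status)
```

With n_folds = 1 every row receives the same label, which corresponds to fitting on the full data.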
Value
A list of lists, with elements of:
- full_concordance
Concordance when fit on the full data
- models
Cox model from full data set fit, stripped of large memory elements
- cv_concordance
Cross-validated Concordance
- included_variables
Variables included in the model, other than those being selected upon
Examples
# Candidate variables for selection
variables = c("gender",
              "age_years_interview", "education_adult")

# Forward selection with a concordance threshold of 0.02
res = cforward(nhanes_example,
               event_time = "event_time_years",
               event_status = "mortstat",
               weight_column = "WTMEC4YR_norm",
               variables = variables,
               included_variables = NULL,
               n_folds = 5,
               c_threshold = 0.02,
               seed = 1989,
               max_model_size = 50,
               verbose = TRUE)
# Best concordance at each selection step
conc = sapply(res, `[[`, "best_concordance")
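
# Same selection without a threshold; the steps are filtered afterwards
res = cforward(nhanes_example,
               event_time = "event_time_years",
               event_status = "mortstat",
               weight_column = "WTMEC4YR_norm",
               variables = variables,
               included_variables = NULL,
               n_folds = 5,
               seed = 1989,
               max_model_size = 50,
               verbose = TRUE)
conc = sapply(res, `[[`, "best_concordance")

# Keep variables whose step raised concordance by more than the threshold
# (the leading 1 means the first variable is always kept)
threshold = 0.01
included_variables = names(conc)[c(1, diff(conc)) > threshold]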

# Second round: force in the retained variables and select among new ones
new_variables = c("diabetes", "stroke")
second_level = cforward(nhanes_example,
                        event_time = "event_time_years",
                        event_status = "mortstat",
                        weight_column = "WTMEC4YR_norm",
                        variables = new_variables,
                        included_variables = included_variables,
                        n_folds = 5,
                        seed = 1989,
                        max_model_size = 50,
                        verbose = TRUE)
second_conc = sapply(second_level, `[[`, "best_concordance")

# Take the step with the highest concordance, then the model in that step
# with the highest cross-validated concordance
result = second_level[[which.max(second_conc)]]
final_model = result$models[[which.max(result$cv_concordance)]]
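
The filtering idiom used above, c(1, diff(conc)) > threshold, can be tried without fitting any models. The sketch below uses a made-up named vector of concordances in selection order; the values are illustrative only and are not output from cforward.

```r
# Made-up concordances in selection order; names are the variables added
conc <- c(age_years_interview = 0.70, gender = 0.72, education_adult = 0.725)
threshold <- 0.01

# diff() gives each step's gain over the previous step; the leading 1 stands
# in for the first step, so the first variable always passes the filter
gain <- c(1, diff(conc))
keep <- names(conc)[gain > threshold]
keep
# "age_years_interview" "gender"
```

Note that this filter tests each step on its own, so a later step with a large gain would be kept even if an earlier step fell below the threshold.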