intrinsic_selection {flevr} | R Documentation |
Perform intrinsic, ensemble-based variable selection
Description
Select variables based on estimated SPVIM (Shapley population variable importance measure) values, using the specified error-controlling method.
Usage
intrinsic_selection(
  spvim_ests = NULL,
  sample_size = NULL,
  feature_names = "",
  alpha = 0.05,
  control = list(quantity = "gFWER", base_method = "Holm", fdr_method = NULL,
    q = NULL, k = NULL)
)
Arguments
spvim_ests |
the estimated SPVIM values (an object of class |
sample_size |
the number of independent observations used to estimate the SPVIM values. |
feature_names |
the names of the features (a character vector of
length |
alpha |
the nominal generalized family-wise error rate, proportion of false positives, or false discovery rate level to control at (e.g., 0.05). |
control |
a list of parameters to control the variable selection process. Parameters include quantity, base_method, fdr_method, q, and k (see Usage for the defaults). |
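To make the roles of alpha, base_method = "Holm", and k more concrete, the following is a minimal sketch in base R only. It is not the internal implementation of intrinsic_selection: the p-values are hypothetical, and the augmentation step (adding the k next-most-significant features to the Holm-selected set) is one common way to move from family-wise error control to generalized family-wise error control, assumed here for illustration.

```r
# Hypothetical p-values for five features (illustrative only, not flevr output)
p_vals <- c(x1 = 0.001, x2 = 0.012, x3 = 0.020, x4 = 0.200, x5 = 0.650)
alpha <- 0.05  # nominal error level
k <- 1         # number of additional rejections allowed (assumed meaning of k)

# Holm step-down adjustment (base R), which controls the family-wise error rate
holm_adj <- p.adjust(p_vals, method = "holm")
holm_selected <- holm_adj <= alpha

# Augmentation sketch: also select the k smallest p-values not already selected
ord <- order(p_vals)
extra <- head(ord[!holm_selected[ord]], k)
augmented <- holm_selected
augmented[extra] <- TRUE

# For comparison: Benjamini-Hochberg adjustment targets the false discovery rate
bh_selected <- p.adjust(p_vals, method = "BH") <= alpha

names(which(holm_selected))  # features selected by Holm alone
names(which(augmented))      # Holm set plus k additional features
names(which(bh_selected))    # features selected under BH
```

With these made-up inputs, Holm alone selects x1 and x2, and the augmented set adds x3. A larger k or a less conservative error criterion can only enlarge the selected set, which is why the choice of quantity and k in control matters in practice.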
Value
a tibble with the estimated intrinsic variable importance, the corresponding variable importance ranks, and the selected variables.
See Also
sp_vim for specific usage of the sp_vim function and the vimp package for estimating intrinsic variable importance.
Examples
data("biomarkers")
# subset to complete cases for illustration
cc <- complete.cases(biomarkers)
dat_cc <- biomarkers[cc, ]
# use only the mucinous outcome, not the high-malignancy outcome
y <- dat_cc$mucinous
x <- dat_cc[, !(names(dat_cc) %in% c("mucinous", "high_malignancy"))]
feature_nms <- names(x)
# estimate SPVIMs (using simple library and V = 2 for illustration only)
set.seed(20231129)
library("SuperLearner")
est <- vimp::sp_vim(Y = y, X = x, V = 2, type = "auc", SL.library = "SL.glm",
                    cvControl = list(V = 2))
# do intrinsic selection
intrinsic_set <- intrinsic_selection(spvim_ests = est, sample_size = nrow(dat_cc),
                                     alpha = 0.2, feature_names = feature_nms,
                                     control = list(quantity = "gFWER",
                                                    base_method = "Holm", k = 1))
intrinsic_set