| fselect {mlr3fselect} | R Documentation |
Function for Feature Selection
Description
Function to optimize the features of a mlr3::Learner.
The function internally creates a FSelectInstanceBatchSingleCrit or FSelectInstanceBatchMultiCrit which describes the feature selection problem.
It executes the feature selection with the FSelector (fselector) and returns the result with the fselect instance ($result).
The ArchiveBatchFSelect ($archive) stores all evaluated feature subsets and performance scores.
Usage
fselect(
fselector,
task,
learner,
resampling,
measures = NULL,
term_evals = NULL,
term_time = NULL,
terminator = NULL,
store_benchmark_result = TRUE,
store_models = FALSE,
check_values = FALSE,
callbacks = NULL,
ties_method = "least_features"
)
Arguments
fselector
(FSelector) Optimization algorithm.
task
(mlr3::Task) Task to operate on.
learner
(mlr3::Learner) Learner to optimize the feature subset for.
resampling
(mlr3::Resampling) Resampling that is used to evaluate the performance of the feature subsets. Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits.
measures
(mlr3::Measure or list of mlr3::Measure) A single measure creates a FSelectInstanceBatchSingleCrit and multiple measures a FSelectInstanceBatchMultiCrit. If NULL, the default measure is used.
term_evals
(integer(1)) Number of allowed evaluations. Ignored if terminator is passed.
term_time
(integer(1)) Maximum allowed time in seconds. Ignored if terminator is passed.
terminator
(bbotk::Terminator) Stop criterion of the feature selection.
store_benchmark_result
(logical(1)) Store benchmark result in archive?
store_models
(logical(1)) Store models in benchmark result?
check_values
(logical(1)) Check the parameters before the evaluation and the results for validity?
callbacks
(list of CallbackBatchFSelect) List of callbacks applied during the feature selection.
ties_method
(character(1)) The method to break ties when selecting feature sets during optimization and when selecting the best set. Can be "least_features" or "random". The default "least_features" selects the feature set with the fewest features; if several best sets have the same number of features, one of them is selected randomly. The "random" method returns a random feature set from the best feature sets.
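As a minimal sketch of some non-default options (the choice of task, learner and resampling here is only illustrative):
# Keep the fitted models and break ties randomly instead of by feature count
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  term_evals = 10,
  store_models = TRUE,
  ties_method = "random"
)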
Details
The mlr3::Task, mlr3::Learner, mlr3::Resampling, mlr3::Measure and bbotk::Terminator are used to construct a FSelectInstanceBatchSingleCrit.
If multiple performance Measures are supplied, a FSelectInstanceBatchMultiCrit is created.
The parameters term_evals and term_time are shortcuts to create a bbotk::Terminator.
If both parameters are passed, a bbotk::TerminatorCombo is constructed.
For other Terminators, pass one with terminator.
If no termination criterion is needed, set term_evals, term_time and terminator to NULL.
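For example, a custom terminator can be passed directly. This is a minimal sketch assuming the bbotk sugar function trm() and its "run_time" terminator; task, learner and resampling are placeholders:
# Stop the feature selection after 60 seconds
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  terminator = trm("run_time", secs = 60)
)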
Value
FSelectInstanceBatchSingleCrit | FSelectInstanceBatchMultiCrit
Resources
There are several sections about feature selection in the mlr3book.
Getting started with wrapper feature selection.
Do a sequential forward selection on the Palmer Penguins data set.
The gallery features a collection of case studies and demos about optimization.
Utilize the built-in feature importance of models with Recursive Feature Elimination.
Run a feature selection with Shadow Variable Search.
Feature Selection on the Titanic data set.
Analysis
For analyzing the feature selection results, it is recommended to pass the archive to as.data.table().
The returned data table is joined with the benchmark result which adds the mlr3::ResampleResult for each feature set.
The archive provides various getters (e.g. $learners()) to ease access.
All getters extract by position (i) or unique hash (uhash).
For a complete list of all getters see the methods section.
The benchmark result ($benchmark_result) allows scoring the feature sets again on a different measure.
Alternatively, measures can be supplied to as.data.table().
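A short sketch of these options, assuming an instance like the one created in the Examples below and the mlr3 accuracy measure "classif.acc":
# Archive as a data table, joined with the benchmark result
as.data.table(instance$archive)
# Retrieve the learners of the first evaluation by position
instance$archive$learners(i = 1)
# Score the feature sets again on another measure via the benchmark result
instance$archive$benchmark_result$score(msr("classif.acc"))
# Or supply additional measures directly to as.data.table()
as.data.table(instance$archive, measures = msrs("classif.acc"))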
Examples
# Feature selection on the Pima Indians Diabetes data set
task = tsk("pima")
learner = lrn("classif.rpart")
# Run feature selection
instance = fselect(
fselector = fs("random_search"),
task = task,
learner = learner,
resampling = rsmp("holdout"),
measures = msr("classif.ce"),
term_evals = 4)
# Subset task to optimized feature set
task$select(instance$result_feature_set)
# Train the learner with optimal feature set on the full data set
learner$train(task)
# Inspect all evaluated configurations
as.data.table(instance$archive)
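# A hedged sketch of a multi-criteria run, which creates a
# FSelectInstanceBatchMultiCrit as described above; the second measure
# "time_train" is chosen only for illustration
instance_multi = fselect(
  fselector = fs("random_search"),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msrs(c("classif.ce", "time_train")),
  term_evals = 4)
# Non-dominated feature sets
instance_multi$result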