export_model_vimp {familiar} | R Documentation
Extract and export model-based variable importance.
Description
Extract and export model-based variable importance from a familiarCollection.
Usage
export_model_vimp(
  object,
  dir_path = NULL,
  aggregate_results = TRUE,
  aggregation_method = waiver(),
  rank_threshold = waiver(),
  export_collection = FALSE,
  ...
)
## S4 method for signature 'familiarCollection'
export_model_vimp(
  object,
  dir_path = NULL,
  aggregate_results = TRUE,
  aggregation_method = waiver(),
  rank_threshold = waiver(),
  export_collection = FALSE,
  ...
)
## S4 method for signature 'ANY'
export_model_vimp(
  object,
  dir_path = NULL,
  aggregate_results = TRUE,
  aggregation_method = waiver(),
  rank_threshold = waiver(),
  export_collection = FALSE,
  ...
)
Arguments
object |
A |
dir_path |
Path to folder where extracted data should be saved. |
aggregate_results |
Flag that signifies whether results should be aggregated for export. |
aggregation_method |
(optional) The method used to aggregate variable importances over different data subsets, e.g. bootstraps. The following methods can be selected:
|
rank_threshold |
(optional) The threshold used to define the subset of highly important features. If not set, this threshold is determined by maximising the variance in the occurrence value over all features over the subset size. This parameter is only relevant for |
export_collection |
(optional) Exports the collection if TRUE. |
... |
Arguments passed on to
|
Details
Data, such as model performance and calibration information, is usually
collected from a familiarCollection object. However, you can also provide one
or more familiarData objects, which will be internally converted to a
familiarCollection object. It is also possible to provide a familiarEnsemble
or one or more familiarModel objects together with the data from which the
results are computed prior to export. Paths to files containing any of these
objects can also be provided.
All parameters aside from object and dir_path are only used if object is not
a familiarCollection object, or a path to one.
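As a minimal usage sketch of the above, the call below passes a path to a stored familiarCollection and leaves dir_path unset, so that the result is returned instead of written to disk. The file path is hypothetical, and storing the collection as an RDS file is an assumption made for illustration. The call is guarded so that the snippet does nothing when the familiar package or the file is unavailable.

```r
# Hypothetical location of a previously created familiarCollection.
# Replace with the path to your own collection file.
collection_path <- "path/to/familiar_collection.RDS"

vimp_result <- NULL

# Only attempt the export when the package and the file are both available.
if (requireNamespace("familiar", quietly = TRUE) && file.exists(collection_path)) {
  # dir_path = NULL: nothing is written to csv; the result is returned instead.
  # The remaining arguments keep their defaults, since they only matter when
  # object is not a familiarCollection (or a path to one).
  vimp_result <- familiar::export_model_vimp(
    object = collection_path,
    dir_path = NULL
  )
}
```

To write csv files instead, set dir_path to an existing folder. In that case no return value should be expected.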
Variable importance is based on the ranking produced by model-specific
variable importance routines, e.g. permutation for random forests. If such a
routine is absent, variable importance is based on the feature selection
method that led to the features included in the model. In case multiple
models (familiarModel objects) are combined, feature ranks are first
aggregated using the method defined by aggregation_method. Some aggregation
methods require a rank_threshold to indicate a subset of most important
features.
Information concerning highly similar features that form clusters is provided as well. This information is based on consensus clustering of the features that were used in the signatures of the underlying models. This clustering information is also used during aggregation to ensure that co-clustered features are only taken into account once.
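The following base R sketch illustrates the general idea of aggregating feature ranks over several models and of choosing a rank threshold by maximising the variance in occurrence. It is not familiar's implementation: the toy ranks, the feature names, the mean-rank tie-breaker and the helper function occurrence() are all assumptions made for illustration, and feature clustering is ignored.

```r
# Toy ranks (1 = most important) of four features in three hypothetical models.
ranks <- rbind(
  model_1 = c(age = 1, stage = 2, grade = 3, size = 4),
  model_2 = c(age = 2, stage = 1, grade = 4, size = 3),
  model_3 = c(age = 1, stage = 3, grade = 2, size = 4)
)

# A simple aggregation that needs no threshold: the mean rank per feature.
mean_rank <- colMeans(ranks)

# Occurrence (hypothetical helper): the fraction of models in which a feature
# is ranked within the top k.
occurrence <- function(ranks, k) colMeans(ranks <= k)

# Try every subset size k and compute the variance of occurrence over features.
candidate_k <- seq_len(ncol(ranks))
occurrence_variance <- vapply(
  candidate_k,
  function(k) stats::var(occurrence(ranks, k)),
  numeric(1)
)

# Use the subset size with the largest variance as the rank threshold.
rank_threshold <- candidate_k[which.max(occurrence_variance)]

# Order features by decreasing occurrence at the threshold, breaking ties
# by mean rank.
feature_order <- colnames(ranks)[
  order(-occurrence(ranks, rank_threshold), mean_rank)
]

print(rank_threshold)
print(feature_order)
```

With these toy ranks, a threshold of 2 separates the features best: occurrence then ranges from 1 for age down to 0 for size, whereas a threshold of 4 would give every feature the same occurrence and carry no information.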
Value
A data.table if dir_path is not provided. Otherwise nothing is returned, as
all data is exported to csv files.