shapley.feature.selection {shapley}    R Documentation

Select the top features with the highest weighted mean SHAP values based on the specified criteria

Description

This function identifies the top features and prepares the data for plotting either the SHAP contributions for each row or a summary of the absolute SHAP contributions for each feature.

Usage

shapley.feature.selection(
  shapley,
  method = "lowerCI",
  cutoff = 0,
  top_n_features = NULL,
  features = NULL
)

Arguments

shapley

shapley object

method

character, specifying the method used to identify the most important features according to their weighted SHAP values. The default selection method is "lowerCI", which selects features whose lower weighted confidence interval bound exceeds the predefined 'cutoff' value (default is relative SHAP of 1). Alternatively, "mean" selects any feature whose normalized weighted mean SHAP contribution is above the specified 'cutoff'. A third option is "shapratio", which computes each feature's weighted SHAP value as a proportion of the aggregate over all features and selects the features whose proportion exceeds the 'cutoff'.
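The three selection criteria can be illustrated with a minimal base R sketch. It does not use the package internals: the table `summary_tbl`, its column names, and the helper `select_features()` are hypothetical and exist only for this illustration, and the "shapratio" score is assumed here to be each feature's mean divided by the sum of all means.

```r
# Hypothetical per-feature summary. The column names are assumptions made for
# this sketch and do not describe the internal structure of a shapley object.
summary_tbl <- data.frame(
  feature = c("age", "income", "height", "noise"),
  mean    = c(0.40, 0.25, 0.05, 0.01),    # normalized weighted mean SHAP
  lowerCI = c(0.32, 0.10, -0.02, -0.04),  # lower bound of the weighted CI
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the names of features whose score under the
# chosen method is strictly above the cutoff.
select_features <- function(tbl, method = "lowerCI", cutoff = 0) {
  score <- switch(
    method,
    lowerCI   = tbl$lowerCI,
    mean      = tbl$mean,
    shapratio = tbl$mean / sum(tbl$mean),  # share of the total (assumed form)
    stop("unknown method: ", method)
  )
  tbl$feature[score > cutoff]
}

select_features(summary_tbl, "lowerCI",   cutoff = 0)    # "age" "income"
select_features(summary_tbl, "mean",      cutoff = 0.1)  # "age" "income"
select_features(summary_tbl, "shapratio", cutoff = 0.5)  # "age"
```

In this toy table, "height" has a positive mean but a negative lower bound, so it passes a "mean" cutoff of zero yet is excluded by "lowerCI".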

cutoff

numeric, specifying the cutoff for the method used to select the top features. The default is zero, which means that every feature whose value under the chosen 'method' criterion is above zero will be selected.

top_n_features

integer. If specified, the top n features with the highest weighted SHAP values will be selected, overruling the 'cutoff' and 'method' arguments.
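As a rough illustration of how a top-n request can take precedence over a cutoff, consider the following base R sketch. The vector `shap_mean` and the helper `choose_features()` are hypothetical and are not part of the package.

```r
# Hypothetical named vector of weighted mean SHAP values (illustration only).
shap_mean <- c(age = 0.40, income = 0.25, height = 0.05, noise = 0.01)

# Hypothetical helper: when top_n_features is given, rank the features and
# keep the first n, ignoring the cutoff; otherwise apply the cutoff.
choose_features <- function(scores, cutoff = 0, top_n_features = NULL) {
  if (!is.null(top_n_features)) {
    ranked <- sort(scores, decreasing = TRUE)
    return(names(head(ranked, top_n_features)))
  }
  names(scores)[scores > cutoff]
}

choose_features(shap_mean, cutoff = 0.1)                      # "age" "income"
choose_features(shap_mean, cutoff = 0.1, top_n_features = 3)  # "age" "income" "height"
```

In the second call, "height" is returned even though it falls below the cutoff of 0.1, because the top-n rule is applied instead of the cutoff.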

features

character vector, specifying the features to be plotted.

Value

normalized numeric vector

Author(s)

E. F. Haghish


[Package shapley version 0.3 Index]