Counterfactuals {counterfactuals} | R Documentation
Counterfactuals Class
Description
A Counterfactuals object should be created by the $find_counterfactuals method of CounterfactualMethodRegr or CounterfactualMethodClassif.
It contains the counterfactuals and has several methods for their evaluation and visualization.
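The following is a minimal sketch of how such an object might be obtained, not a definitive recipe. It assumes the packages counterfactuals, iml, and randomForest are installed, and it uses WhatIfClassif as one example of a CounterfactualMethodClassif; the model, the iris data, and the desired class are illustrative choices only.

```r
if (requireNamespace("counterfactuals", quietly = TRUE) &&
    requireNamespace("iml", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)

  # Fit a model and wrap it, together with the data, in an iml Predictor
  rf <- randomForest::randomForest(Species ~ ., data = iris)
  predictor <- iml::Predictor$new(rf, data = iris, y = "Species", type = "prob")

  # Observation of interest: a single row holding only the feature columns
  x_interest <- iris[150L, names(iris) != "Species"]

  # $find_counterfactuals() returns the Counterfactuals object
  wi_classif <- counterfactuals::WhatIfClassif$new(predictor, n_counterfactuals = 5L)
  cfactuals <- wi_classif$find_counterfactuals(
    x_interest, desired_class = "versicolor", desired_prob = c(0.5, 1)
  )

  print(cfactuals$data)        # the counterfactuals
  print(cfactuals$evaluate())  # counterfactuals together with evaluation measures
  print(cfactuals$predict())   # predictions for the counterfactuals
}
```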
Active bindings
desired
(list(1) | list(2))
A list with the desired properties of the counterfactuals. For regression tasks it has one element, desired_outcome (CounterfactualMethodRegr); for classification tasks it has two elements, desired_class and desired_prob (CounterfactualMethodClassif).
data
(data.table)
The counterfactuals for x_interest.
x_interest
(data.table(1))
A single row with the observation of interest.
distance_function
(function())
The distance function used for the distance-based evaluation measures dist_x_interest and dist_train. The function must have three arguments, x, y, and data, and return a numeric matrix. If set to NULL (default), Gower distance (Gower 1971) is used.
method
(character)
Name of the method with which the counterfactuals were generated.
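As an illustration of the three-argument shape expected of distance_function, here is a sketch of a custom distance in base R. The function name l1_distance is hypothetical. The orientation of the result (one row per row of x, one column per row of y) and the restriction to numeric columns are assumptions made for this example, not statements about the package.

```r
# Hypothetical custom distance: range-scaled L1 distance over the numeric columns.
# `data` is only used to look up column types and feature ranges.
l1_distance <- function(x, y, data) {
  num_cols <- names(data)[vapply(data, is.numeric, logical(1))]
  ranges <- vapply(num_cols, function(col) diff(range(data[[col]])), numeric(1))
  ranges[ranges == 0] <- 1  # avoid division by zero for constant features

  x_mat <- as.matrix(as.data.frame(x)[, num_cols, drop = FALSE])
  y_mat <- as.matrix(as.data.frame(y)[, num_cols, drop = FALSE])

  # Assumed layout: rows correspond to rows of x, columns to rows of y
  out <- matrix(0, nrow = nrow(x_mat), ncol = nrow(y_mat))
  for (i in seq_len(nrow(x_mat))) {
    for (j in seq_len(nrow(y_mat))) {
      out[i, j] <- mean(abs(x_mat[i, ] - y_mat[j, ]) / ranges)
    }
  }
  out
}

train <- data.frame(a = c(0, 10), b = c(0, 2), f = factor(c("u", "v")))
d <- l1_distance(train[1, ], train, train)
print(d)
```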
Methods
Public methods
Method new()
Creates a new Counterfactuals object.
This method should only be called by the $find_counterfactuals methods of CounterfactualMethodRegr and CounterfactualMethodClassif.
Usage
Counterfactuals$new(
  cfactuals,
  predictor,
  x_interest,
  param_set,
  desired,
  method = NULL
)
Arguments
cfactuals
(data.table)
The counterfactuals. Must have the same column names and types as predictor$data$X.
predictor
(Predictor)
The object (created with iml::Predictor$new()) holding the machine learning model and the data.
x_interest
(data.table(1) | data.frame(1))
A single row with the observation of interest.
param_set
(ParamSet)
A ParamSet based on the features of predictor$data$X.
desired
(list(1) | list(2))
A list with the desired properties of the counterfactuals. It should have one element, desired_outcome, for regression tasks (CounterfactualMethodRegr) and two elements, desired_class and desired_prob, for classification tasks (CounterfactualMethodClassif).
method
(character)
Name of the method with which the counterfactuals were generated. Default is NULL, which means that no name is provided.
Method evaluate()
Evaluates the counterfactuals. It returns the counterfactuals together with the evaluation measures.
Usage
Counterfactuals$evaluate(
  measures = c("dist_x_interest", "dist_target", "no_changed", "dist_train", "minimality"),
  show_diff = FALSE,
  k = 1L,
  weights = NULL
)
Arguments
measures
(character)
The name of one or more evaluation measures. The following measures are available:
- dist_x_interest: The distance of a counterfactual to x_interest, measured by Gower's dissimilarity measure (Gower 1971).
- dist_target: The absolute distance of the prediction for a counterfactual to the interval desired_outcome (regression tasks) or desired_prob (classification tasks).
- no_changed: The number of features changed with respect to x_interest.
- dist_train: The (weighted) distance to the k nearest training data points, measured by Gower's dissimilarity measure (Gower 1971).
- minimality: The number of changed features that could each be set back to the value of x_interest while keeping the desired prediction value.
show_diff
(logical(1))
Should the counterfactuals be displayed as their differences to x_interest? Default is FALSE. If set to TRUE, positive values for numeric features indicate an increase compared to the feature value in x_interest, and negative values indicate a decrease. For factors, the feature value is displayed if it differs from x_interest. NA means "no difference" in both cases.
k
(integerish(1))
How many nearest training points should be considered for computing the dist_train measure? Default is 1L.
weights
(numeric(k) | NULL)
How should the k nearest training points be weighted when computing the dist_train measure? If NULL (default), all k points are weighted equally. If a numeric vector of length k is given, the i-th element specifies the weight of the i-th closest data point.
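To make the measures above more concrete, the sketch below recomputes analogues of dist_x_interest, no_changed, and dist_train on toy data in base R. It does not call the package. The helper gower_to_rows, the toy data, and the use of a weighted mean to combine the k nearest distances are assumptions made for illustration; the package's own computation may differ in detail.

```r
# Gower-style dissimilarity between the single row `a` and every row of `b`.
# Numeric features: absolute difference scaled by the feature's range in `ref`.
# Other features: 0 if equal, 1 otherwise. The result is the mean over features.
gower_to_rows <- function(a, b, ref) {
  per_feature <- vapply(names(ref), function(col) {
    if (is.numeric(ref[[col]])) {
      rng <- diff(range(ref[[col]]))
      if (rng == 0) rng <- 1
      abs(b[[col]] - a[[col]]) / rng
    } else {
      as.numeric(as.character(b[[col]]) != as.character(a[[col]]))
    }
  }, numeric(nrow(b)))
  rowMeans(matrix(per_feature, nrow = nrow(b)))
}

# Toy training data, observation of interest, and one counterfactual
train <- data.frame(age = c(20, 30, 40, 60), job = factor(c("a", "b", "a", "b")))
x_interest <- data.frame(age = 30, job = factor("a", levels = c("a", "b")))
cf <- data.frame(age = 40, job = factor("b", levels = c("a", "b")))

# Analogue of dist_x_interest
dist_x_interest <- gower_to_rows(cf, x_interest, train)

# Analogue of no_changed: number of features that differ from x_interest
no_changed <- sum(vapply(names(cf), function(col) {
  as.character(cf[[col]]) != as.character(x_interest[[col]])
}, logical(1)))

# Analogue of dist_train with k = 2: the i-th weight applies to the i-th closest point
k <- 2L
weights <- c(0.75, 0.25)
nearest <- sort(gower_to_rows(cf, train, train))[seq_len(k)]
dist_train <- weighted.mean(nearest, weights)

print(c(dist_x_interest = dist_x_interest, no_changed = no_changed, dist_train = dist_train))
```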
Method evaluate_set()
Evaluates a set of counterfactuals. It returns the evaluation measures.
Usage
Counterfactuals$evaluate_set(
  measures = c("diversity", "no_nondom", "frac_nondom", "hypervolume"),
  nadir = NULL
)
Arguments
measures
(character)
The name of one or more evaluation measures. The following measures are available:
- diversity: Diversity of the returned counterfactuals in the feature space.
- no_nondom: Number of counterfactuals that are not dominated by other counterfactuals.
- frac_nondom: Fraction of counterfactuals that are not dominated by other counterfactuals.
- hypervolume: Hypervolume of the induced Pareto front.
nadir
(numeric)
Maximum objective values used to calculate the dominated hypervolume. Only considered if hypervolume is one of the measures. May be a scalar, in which case it is used for all four objectives, or a vector of length 4. Default is NULL, meaning the nadir point by Dandl et al. (2020) is used: (minimum distance between the prediction for x_interest and desired_prob/desired_outcome, 1, number of features, 1).
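The notion of dominance behind no_nondom and frac_nondom can be sketched in a few lines of base R. The helper is_nondominated and the toy objective matrix are hypothetical, and the sketch assumes that all four objectives are to be minimised; it does not reproduce the package's implementation.

```r
# A row is dominated if another row is at least as good in every objective
# and strictly better in at least one (all objectives are minimised here).
is_nondominated <- function(obj) {
  n <- nrow(obj)
  vapply(seq_len(n), function(i) {
    dominated <- vapply(seq_len(n), function(j) {
      j != i && all(obj[j, ] <= obj[i, ]) && any(obj[j, ] < obj[i, ])
    }, logical(1))
    !any(dominated)
  }, logical(1))
}

# One row per counterfactual, one column per objective (illustrative values)
obj <- rbind(
  c(0.0, 0.10, 1, 0.05),
  c(0.0, 0.20, 2, 0.10),  # dominated by the first row
  c(0.1, 0.05, 1, 0.02),
  c(0.0, 0.30, 1, 0.01)
)

nondom <- is_nondominated(obj)
no_nondom <- sum(nondom)     # number of non-dominated counterfactuals
frac_nondom <- mean(nondom)  # fraction of non-dominated counterfactuals
print(c(no_nondom = no_nondom, frac_nondom = frac_nondom))
```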
Method predict()
Returns the predictions for the counterfactuals.
Usage
Counterfactuals$predict()
Method subset_to_valid()
Subsets the data to the counterfactuals that meet the desired prediction. This can be reverted with revert_subset_to_valid().
Usage
Counterfactuals$subset_to_valid()
Method revert_subset_to_valid()
Reverts the subsetting performed by subset_to_valid().
Usage
Counterfactuals$revert_subset_to_valid()
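The idea of restricting counterfactuals to those that meet the desired prediction can be sketched in base R as follows. The predictions and the interval are made-up values, and treating the interval bounds as inclusive is an assumption of this sketch.

```r
# Illustrative predicted probabilities for four counterfactuals
preds <- c(0.42, 0.71, 0.55, 0.93)
desired_prob <- c(0.5, 1)

# A counterfactual counts as valid here if its prediction lies in the desired interval
is_valid <- preds >= desired_prob[1] & preds <= desired_prob[2]
valid_ids <- which(is_valid)

# Keeping the full index set around is what makes the subsetting revertible
all_ids <- seq_along(preds)
print(valid_ids)
```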
Method plot_parallel()
Plots a parallel plot that connects the (scaled) feature values of each counterfactual and highlights x_interest in blue.
Usage
Counterfactuals$plot_parallel(
  feature_names = NULL,
  row_ids = NULL,
  digits_min_max = 2L
)
Arguments
feature_names
(character | NULL)
The names of the (numeric) features to display. If NULL (default), all features are displayed.
row_ids
(integerish | NULL)
The row ids of the counterfactuals to display. If NULL (default), all counterfactuals are displayed.
digits_min_max
Maximum number of digits for the minimum and maximum feature values. Default is 2L.
Method plot_freq_of_feature_changes()
Plots a bar chart with the frequency of feature changes across all counterfactuals.
Usage
Counterfactuals$plot_freq_of_feature_changes(subset_zero = FALSE)
Arguments
subset_zero
(logical(1))
Should unchanged features be excluded from the plot? Default is FALSE.
Method get_freq_of_feature_changes()
Returns the frequency of feature changes across all counterfactuals.
Usage
Counterfactuals$get_freq_of_feature_changes(subset_zero = FALSE)
Arguments
subset_zero
(logical(1))
Should unchanged features be excluded? Default is FALSE.
Returns
A (named) numeric vector with the frequency of feature changes.
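A rough base R analogue of this computation is sketched below. The helper freq_of_feature_changes is hypothetical; reporting the frequency as a proportion of counterfactuals and sorting it in decreasing order are assumptions of this sketch rather than documented behaviour.

```r
# Share of counterfactuals in which each feature differs from x_interest
freq_of_feature_changes <- function(cfs, x_interest, subset_zero = FALSE) {
  freq <- vapply(names(cfs), function(col) {
    mean(as.character(cfs[[col]]) != as.character(x_interest[[col]]))
  }, numeric(1))
  if (subset_zero) freq <- freq[freq > 0]  # drop features that never change
  sort(freq, decreasing = TRUE)
}

x_interest <- data.frame(age = 30, job = "a", city = "x", stringsAsFactors = FALSE)
cfs <- data.frame(
  age = c(30, 40, 30, 50),
  job = c("b", "b", "a", "b"),
  city = c("x", "x", "x", "x"),
  stringsAsFactors = FALSE
)

freq_all <- freq_of_feature_changes(cfs, x_interest)
freq_changed <- freq_of_feature_changes(cfs, x_interest, subset_zero = TRUE)
print(freq_all)
print(freq_changed)
```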
Method plot_surface()
Creates a surface plot for two features. x_interest is represented as a white dot, and all counterfactuals that differ from x_interest only in the two selected features are represented as black dots.
The tick marks next to the axes show the marginal distribution of the observed data (predictor$data$X).
The exact plot type depends on the selected feature types and number of features:
2 numeric features: surface plot
2 non-numeric features: heatmap
1 numeric or non-numeric feature: line graph
Usage
Counterfactuals$plot_surface(feature_names, grid_size = 250L)
Arguments
feature_names
(character(2))
The names of the features to plot.
grid_size
(integerish(1))
The grid size of the plot. It is ignored in case of two non-numeric features. Default is 250L.
Method print()
Prints the Counterfactuals object.
Usage
Counterfactuals$print()
Method clone()
The objects of this class are cloneable with this method.
Usage
Counterfactuals$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
References
Gower, J. C. (1971), "A general coefficient of similarity and some of its properties". Biometrics, 27(4), 857–871.