NICERegr {counterfactuals} | R Documentation |
NICE (Nearest Instance Counterfactual Explanations) for Regression Tasks
Description
NICE (Brughmans and Martens 2021) searches for counterfactuals by iteratively replacing feature values of x_interest with the corresponding values of its most similar (optionally correctly predicted) instance x_nn.
While the original method is only applicable to classification tasks (see NICEClassif), this implementation extends it to regression tasks.
Details
NICE starts the counterfactual search for x_interest by finding its most similar (optionally correctly predicted) neighbor x_nn whose prediction lies within the desired prediction range. Correctly predicted means that the prediction of x_nn is less than a user-specified margin_correct away from the true outcome of x_nn.
This criterion is designed to mimic, for regression tasks, the original method's search for a correctly classified x_nn.
If no x_nn satisfies this constraint, a warning is issued that no counterfactual could be found.
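The neighbor search described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the hypothetical predictions in y_pred, the fixed margin of 0.5, and the range-scaled mean absolute difference used as a stand-in distance are all invented for the example.

```r
# Toy training data with true outcomes and hypothetical model predictions.
X <- data.frame(hp = c(110, 93, 175, 105), wt = c(2.6, 2.3, 3.4, 3.5))
y_true <- c(21.0, 22.8, 18.7, 18.1)
y_pred <- c(21.4, 22.5, 18.5, 23.0)

x_interest <- data.frame(hp = 180, wt = 3.4)
desired_outcome <- c(21, 26)
margin_correct <- 0.5  # user-specified margin (assumed value)

# Keep instances predicted within the desired range ...
in_range <- y_pred >= desired_outcome[1] & y_pred <= desired_outcome[2]
# ... whose prediction is less than margin_correct away from the truth.
correct <- abs(y_pred - y_true) < margin_correct
candidates <- which(in_range & correct)

if (length(candidates) == 0) {
  warning("No x_nn satisfies the constraints; no counterfactual found.")
}

# Stand-in distance: range-scaled mean absolute difference per feature.
rng <- sapply(X, function(col) diff(range(col)))
dist_to_interest <- sapply(candidates, function(i) {
  mean(abs(unlist(X[i, ]) - unlist(x_interest)) / rng)
})

x_nn <- X[candidates[which.min(dist_to_interest)], ]
x_nn
```

Rows 1 and 2 pass both filters here; row 4 is predicted within the range but is dropped because its prediction is too far from its true outcome.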
In the first iteration, NICE creates one new instance for each feature in which x_nn differs from x_interest: each new instance is a copy of x_interest with that single feature value replaced by the corresponding value of x_nn. Thus, if x_nn differs from x_interest in d features, d new instances are created.
Then, the reward values for the created instances are computed with the chosen reward function. Available reward functions are sparsity, proximity, and plausibility.
In the second iteration, NICE creates d-1 new instances by replacing a different feature value of the highest reward instance of the previous iteration with the corresponding value of x_nn, and so on.
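The candidate generation of one iteration might look as follows in base R. The sketch assumes that every iteration moves one more feature towards x_nn, which is consistent with the search ending once x_nn is recreated. The helper make_candidates and the toy values are hypothetical, and the reward computation is omitted: the first candidate is simply assumed to have the highest reward.

```r
x_interest <- data.frame(hp = 180, wt = 3.4, cyl = 8)
x_nn <- data.frame(hp = 110, wt = 2.6, cyl = 8)

# Features in which x_interest differs from x_nn (here d = 2).
differing <- names(x_interest)[unlist(x_interest) != unlist(x_nn)]

# Hypothetical helper: one candidate per feature, each copying `current`
# and taking exactly one feature value from x_nn.
make_candidates <- function(current, x_nn, features) {
  cands <- lapply(features, function(f) {
    cand <- current
    cand[[f]] <- x_nn[[f]]
    cand
  })
  do.call(rbind, cands)
}

# First iteration: d candidates.
first_iter <- make_candidates(x_interest, x_nn, differing)

# Assume (for illustration only) the first candidate has the highest reward.
best <- first_iter[1, ]
remaining <- setdiff(differing, "hp")

# Second iteration: d - 1 candidates built from the best instance.
second_iter <- make_candidates(best, x_nn, remaining)
```

With only two differing features, the single candidate of the second iteration already equals x_nn.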
If finish_early = TRUE, the algorithm terminates when the predicted outcome for the highest reward instance is in the interval desired_outcome; if finish_early = FALSE, the algorithm continues until x_nn is recreated.
Once the algorithm has terminated, return_multiple determines which instances are returned as counterfactuals: if return_multiple = FALSE, only the highest reward instance of the last iteration is returned; if return_multiple = TRUE, all instances (of all iterations) whose predicted outcome is in the interval desired_outcome are returned.
If finish_early = FALSE and return_multiple = FALSE, then x_nn is returned as the single counterfactual.
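The selection rule above can be illustrated with a small base R sketch. The archive layout used here (one data.frame per iteration with columns named reward and pred) and the helper select_counterfactuals are assumptions made for the example; the package's actual archive may use different column names.

```r
# Hypothetical archive: one data.frame per iteration.
archive <- list(
  data.frame(hp = c(110, 180), wt = c(3.4, 2.6),
             reward = c(0.8, 0.3), pred = c(20.1, 21.5)),
  data.frame(hp = 110, wt = 2.6, reward = 0.5, pred = 22.3)
)
desired_outcome <- c(21, 26)

select_counterfactuals <- function(archive, desired_outcome, return_multiple) {
  if (return_multiple) {
    # All instances of all iterations predicted within the desired interval.
    all_instances <- do.call(rbind, archive)
    keep <- all_instances$pred >= desired_outcome[1] &
      all_instances$pred <= desired_outcome[2]
    all_instances[keep, ]
  } else {
    # Only the highest reward instance of the last iteration.
    last <- archive[[length(archive)]]
    last[which.max(last$reward), ]
  }
}

single <- select_counterfactuals(archive, desired_outcome, return_multiple = FALSE)
multiple <- select_counterfactuals(archive, desired_outcome, return_multiple = TRUE)
```

In this toy run the highest reward instance of the first iteration is predicted outside the interval, so the search continues for a second iteration; with return_multiple = TRUE the lower reward instance from the first iteration is returned as well because its prediction lies in the interval.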
The function computes the dissimilarities using Gower's dissimilarity measure (Gower 1971).
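As a rough illustration of Gower's dissimilarity for mixed data, the base R sketch below averages a range-scaled absolute difference for numeric features and a 0/1 mismatch for all other features. The function name gower_sketch and the toy data are hypothetical; the sketch ignores missing values and constant features and is not the routine the package calls.

```r
gower_sketch <- function(x, y, data) {
  per_feature <- vapply(names(data), function(f) {
    if (is.numeric(data[[f]])) {
      # Numeric feature: absolute difference scaled by the observed range.
      abs(x[[f]] - y[[f]]) / diff(range(data[[f]]))
    } else {
      # Categorical feature: 0 if equal, 1 otherwise.
      as.numeric(as.character(x[[f]]) != as.character(y[[f]]))
    }
  }, numeric(1))
  mean(per_feature)
}

data <- data.frame(
  hp = c(100, 150, 200),
  fuel = factor(c("gas", "diesel", "gas"))
)

d12 <- gower_sketch(data[1, ], data[2, ], data)  # (50 / 100 + 1) / 2
d13 <- gower_sketch(data[1, ], data[3, ], data)  # (100 / 100 + 0) / 2
```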
Super classes
counterfactuals::CounterfactualMethod -> counterfactuals::CounterfactualMethodRegr -> NICERegr
Active bindings
x_nn
(logical(1))
The most similar (optionally) correctly predicted instance of x_interest.
archive
(list())
A list that stores the history of the algorithm run. For each algorithm iteration, it has one element containing a data.table, which stores all created instances of this iteration together with their reward values and their predictions.
Methods
Public methods
Inherited methods
Method new()
Create a new NICERegr object.
Usage
NICERegr$new(
  predictor,
  optimization = "sparsity",
  x_nn_correct = TRUE,
  margin_correct = NULL,
  return_multiple = FALSE,
  finish_early = TRUE,
  distance_function = "gower"
)
Arguments
predictor
(Predictor)
The object (created with iml::Predictor$new()) holding the machine learning model and the data.
optimization
(character(1))
The reward function to optimize. Can be sparsity (default), proximity or plausibility.
x_nn_correct
(logical(1))
Should only correctly predicted data points in predictor$data$X be considered for the most similar instance search? Default is TRUE.
margin_correct
(numeric(1) | NULL)
The accepted margin for considering a prediction as "correct". Ignored if x_nn_correct = FALSE. If NULL, the accepted margin is set to half the median absolute distance between the true and predicted outcomes in the data (predictor$data).
return_multiple
(logical(1))
Should multiple counterfactuals be returned? If TRUE, the algorithm returns all created instances whose prediction is in the interval desired_outcome. For more information, see the Details section.
finish_early
(logical(1))
Should the algorithm terminate after an iteration in which the prediction for the highest reward instance is in the interval desired_outcome? If FALSE, the algorithm continues until x_nn is recreated.
distance_function
(function() | 'gower' | 'gower_c')
The distance function used to compute the distances between x_interest and the training data points for finding x_nn. If optimization is set to proximity, the distance function is also used for calculating the distance between candidates and x_interest. Either the name of an already implemented distance function ('gower' or 'gower_c') or a function is allowed as input. If set to 'gower' (default), Gower's distance (Gower 1971) is used; if set to 'gower_c', a more efficient C-based version of Gower's distance is used. A function must have three arguments x, y, and data and should return a double matrix with nrow(x) rows and at most nrow(y) columns.
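A custom distance_function with the three arguments x, y, and data could be sketched as below. The function scaled_manhattan is a hypothetical example, not part of the package. It assumes that x, y, and data share the same columns, uses only the numeric ones, and returns a double matrix with nrow(x) rows and nrow(y) columns.

```r
# Hypothetical custom distance: range-scaled Manhattan distance on the
# numeric columns of `data`.
scaled_manhattan <- function(x, y, data) {
  x <- as.data.frame(x)
  y <- as.data.frame(y)
  data <- as.data.frame(data)

  num_cols <- names(data)[vapply(data, is.numeric, logical(1))]
  ranges <- vapply(data[num_cols], function(col) diff(range(col)), numeric(1))
  ranges[ranges == 0] <- 1  # avoid division by zero for constant features

  # Scale each column by its range in `data`.
  xs <- sweep(as.matrix(x[num_cols]), 2, ranges, "/")
  ys <- sweep(as.matrix(y[num_cols]), 2, ranges, "/")

  out <- matrix(0, nrow = nrow(xs), ncol = nrow(ys))
  for (i in seq_len(nrow(xs))) {
    # Sum of absolute scaled differences between row i of x and every row of y.
    out[i, ] <- rowSums(abs(sweep(ys, 2, xs[i, ], "-")))
  }
  out
}

d <- scaled_manhattan(mtcars[1, ], mtcars, mtcars)
dim(d)

# Possible use with this class (requires a fitted predictor):
# NICERegr$new(predictor, distance_function = scaled_manhattan)
```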
Method clone()
The objects of this class are cloneable with this method.
Usage
NICERegr$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
References
Brughmans, D., & Martens, D. (2021). NICE: An Algorithm for Nearest Instance Counterfactual Explanations. arXiv 2104.07411 v2.
Gower, J. C. (1971), "A general coefficient of similarity and some of its properties". Biometrics, 27, 623–637.
Examples
if (require("randomForest")) {
  set.seed(123456)
  # Train a model
  rf = randomForest(mpg ~ ., data = mtcars)
  # Create a predictor object
  predictor = iml::Predictor$new(rf)
  # Find counterfactuals
  nice_regr = NICERegr$new(predictor)
  cfactuals = nice_regr$find_counterfactuals(
    x_interest = mtcars[1L, ], desired_outcome = c(22, 26)
  )
  # Print the results
  cfactuals$data
  # Print archive
  nice_regr$archive
}