| NICERegr {counterfactuals} | R Documentation |
NICE (Nearest Instance Counterfactual Explanations) for Regression Tasks
Description
NICE (Brughmans and Martens 2021) searches for counterfactuals by iteratively replacing feature values
of x_interest with the corresponding value of its most similar (optionally correctly predicted) instance x_nn.
While the original method is only applicable to classification tasks (see NICEClassif), this implementation extends it to regression tasks.
Details
NICE starts the counterfactual search for x_interest by finding its most similar (optionally) correctly predicted
neighbor x_nn with(in) the desired prediction (range). Correctly predicted means that the prediction of x_nn is less
than a user-specified margin_correct away from the true outcome of x_nn.
This margin-based criterion is designed to mimic, for regression tasks, the search for a correctly classified x_nn in the original method.
If no x_nn satisfies this constraint, a warning is returned that no counterfactual could be found.
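The selection of x_nn described above can be sketched in base R. This is only an illustration on toy data, not the package's implementation: the objects X, y_true, and y_pred are hypothetical, the margin follows one reading of the default described for margin_correct (half the median absolute error), and a simple range-scaled distance stands in for Gower's dissimilarity.

```r
# Toy data (hypothetical): two features, true outcomes, and model predictions.
X <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
y_true <- c(22.5, 24, 23, 30)
y_pred <- c(23, 24.5, 27, 25)

x_interest <- data.frame(a = 1.8, b = 19)
desired_outcome <- c(22, 26)

# Assumed default margin: half the median absolute prediction error.
margin_correct <- 0.5 * median(abs(y_true - y_pred))

# Keep rows predicted within the desired range and "correctly" predicted.
in_range <- y_pred >= desired_outcome[1] & y_pred <= desired_outcome[2]
correct <- abs(y_true - y_pred) < margin_correct
candidates <- which(in_range & correct)

# Range-scaled mean absolute difference as a stand-in for Gower's measure.
ranges <- sapply(X, function(col) diff(range(col)))
dists <- sapply(candidates, function(i) {
  mean(abs(unlist(X[i, ]) - unlist(x_interest)) / ranges)
})

# The nearest remaining candidate plays the role of x_nn.
x_nn <- X[candidates[which.min(dists)], ]
x_nn
```

If candidates were empty, no x_nn would exist, which corresponds to the warning case described above.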
In the first iteration, NICE creates new instances by replacing a different feature value of x_interest with the corresponding
value of x_nn in each new instance. Thus, if x_nn differs from x_interest in d features, d new instances are created.
Then, the reward values for the created instances are computed with the chosen reward function.
Available reward functions are sparsity, proximity, and plausibility.
In the second iteration, NICE creates d-1 new instances by replacing a different feature value of the highest
reward instance of the previous iteration with the corresponding value of x_nn, and so on.
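The candidate generation in each iteration can be sketched as follows. This is a minimal base R illustration, not the package's code: make_candidates() is a hypothetical helper, the feature values are made up, and the reward computation is skipped (the winner of the first iteration is simply assumed).

```r
x_interest <- data.frame(cyl = 6, hp = 110, wt = 2.62, am = 1)
x_nn <- data.frame(cyl = 4, hp = 93, wt = 2.32, am = 1)

# Features in which x_interest and x_nn differ (d of them).
diff_feats <- names(x_interest)[unlist(x_interest) != unlist(x_nn)]

# Hypothetical helper: one candidate per feature, each copying `current`
# and swapping in a single value taken from x_nn.
make_candidates <- function(current, x_nn, feats) {
  out <- lapply(feats, function(f) {
    cand <- current
    cand[[f]] <- x_nn[[f]]
    cand
  })
  do.call(rbind, out)
}

# Iteration 1: d candidates, each differing from x_interest in one feature.
iter1 <- make_candidates(x_interest, x_nn, diff_feats)

# Assume (for illustration only) that the reward function picked the
# candidate in which "hp" was swapped.
best <- iter1[diff_feats == "hp", ]

# Iteration 2: d - 1 candidates built from the previous winner.
iter2 <- make_candidates(best, x_nn, setdiff(diff_feats, "hp"))
iter2
```

Here d = 3, so the first iteration yields three candidates and the second yields two, each of which already carries the hp value of x_nn.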
If finish_early = TRUE, the algorithm terminates when the predicted outcome for
the highest reward instance is in the interval desired_outcome; if finish_early = FALSE, the
algorithm continues until x_nn is recreated.
Once the algorithm has terminated, it depends on return_multiple which instances
are returned as counterfactuals: if return_multiple = FALSE, then only the highest reward instance in the
last iteration is returned as counterfactual; if return_multiple = TRUE, then all instances (of all iterations)
whose predicted outcome is in the interval desired_outcome are returned as counterfactuals.
If finish_early = FALSE and return_multiple = FALSE, then x_nn is returned as the single counterfactual.
The function computes the dissimilarities using Gower's dissimilarity measure (Gower 1971).
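For intuition, Gower's dissimilarity between two instances can be sketched in base R as the average of per-feature dissimilarities: a range-scaled absolute difference for numeric features and a 0/1 mismatch for categorical ones. The function gower_sketch() below is a hypothetical illustration of the textbook definition; the package's own implementation may differ in details such as weighting or missing-value handling.

```r
# Hypothetical sketch of Gower's dissimilarity between two single-row
# data frames x and y; `data` supplies the ranges of numeric features.
gower_sketch <- function(x, y, data) {
  stopifnot(nrow(x) == 1L, nrow(y) == 1L)
  per_feature <- vapply(names(data), function(f) {
    if (is.numeric(data[[f]])) {
      rng <- diff(range(data[[f]]))
      # Constant features contribute no dissimilarity.
      if (rng == 0) 0 else abs(x[[f]] - y[[f]]) / rng
    } else {
      # Categorical features: 0 if equal, 1 otherwise.
      as.numeric(as.character(x[[f]]) != as.character(y[[f]]))
    }
  }, numeric(1))
  mean(per_feature)
}

toy <- data.frame(
  hp = c(100, 150, 200),
  gear = factor(c("manual", "auto", "auto"))
)

d12 <- gower_sketch(toy[1, ], toy[2, ], toy)  # hp: 50/100, gear differs
d23 <- gower_sketch(toy[2, ], toy[3, ], toy)  # hp: 50/100, gear equal
```

Because every per-feature term lies in [0, 1], the resulting dissimilarity also lies in [0, 1], which makes numeric and categorical features comparable.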
Super classes
counterfactuals::CounterfactualMethod -> counterfactuals::CounterfactualMethodRegr -> NICERegr
Active bindings
x_nn (logical(1))
The most similar (optionally) correctly predicted instance of x_interest.
archive (list())
A list that stores the history of the algorithm run. For each algorithm iteration, it has one element containing a data.table, which stores all created instances of this iteration together with their reward values and their predictions.
Methods
Public methods
Inherited methods
Method new()
Create a new NICERegr object.
Usage
NICERegr$new(
  predictor,
  optimization = "sparsity",
  x_nn_correct = TRUE,
  margin_correct = NULL,
  return_multiple = FALSE,
  finish_early = TRUE,
  distance_function = "gower"
)
Arguments
predictor (Predictor)
The object (created with iml::Predictor$new()) holding the machine learning model and the data.
optimization (character(1))
The reward function to optimize. Can be sparsity (default), proximity, or plausibility.
x_nn_correct (logical(1))
Should only correctly predicted data points in predictor$data$X be considered for the most similar instance search? Default is TRUE.
margin_correct (numeric(1) | NULL)
The accepted margin for considering a prediction as "correct". Ignored if x_nn_correct = FALSE. If NULL, the accepted margin is set to half the median absolute distance between the true and predicted outcomes in the data (predictor$data).
return_multiple (logical(1))
Should multiple counterfactuals be returned? If TRUE, the algorithm returns all created instances whose prediction is in the interval desired_outcome. For more information, see the Details section.
finish_early (logical(1))
Should the algorithm terminate after an iteration in which the prediction for the highest reward instance is in the interval desired_outcome? If FALSE, the algorithm continues until x_nn is recreated.
distance_function (function() | 'gower' | 'gower_c')
The distance function used to compute the distances between x_interest and the training data points for finding x_nn. If optimization is set to proximity, the distance function is also used for calculating the distance between candidates and x_interest. Either the name of an already implemented distance function ('gower' or 'gower_c') or a function is allowed as input. If set to 'gower' (default), then Gower's distance (Gower 1971) is used; if set to 'gower_c', a more efficient C-based version of Gower's distance is used. A function must have three arguments x, y, and data and should return a double matrix with nrow(x) rows and at most nrow(y) columns.
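A user-supplied distance_function might look like the sketch below. It only illustrates the documented contract (three arguments x, y, and data; a double matrix with nrow(x) rows and at most nrow(y) columns). The name scaled_manhattan and the choice of a range-scaled Manhattan distance over numeric columns are assumptions made for this example, as is the premise that data holds the observations used for scaling.

```r
# Hypothetical custom distance: range-scaled Manhattan distance computed
# on the numeric columns of `data` only.
scaled_manhattan <- function(x, y, data) {
  num_cols <- names(data)[vapply(data, is.numeric, logical(1))]
  ranges <- vapply(data[num_cols], function(col) diff(range(col)), numeric(1))
  ranges[ranges == 0] <- 1  # avoid division by zero for constant columns

  # Scale both inputs by the feature ranges.
  xs <- sweep(as.matrix(x[, num_cols, drop = FALSE]), 2, ranges, "/")
  ys <- sweep(as.matrix(y[, num_cols, drop = FALSE]), 2, ranges, "/")

  # One row per instance in x, one column per instance in y.
  out <- matrix(0, nrow = nrow(xs), ncol = nrow(ys))
  for (i in seq_len(nrow(xs))) {
    out[i, ] <- rowSums(abs(sweep(ys, 2, xs[i, ], "-")))
  }
  out
}

# Distances from the first car in mtcars to every car in mtcars.
d <- scaled_manhattan(mtcars[1, ], mtcars, mtcars)
dim(d)
```

Such a function would then be passed as distance_function = scaled_manhattan when constructing the NICERegr object.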
Method clone()
The objects of this class are cloneable with this method.
Usage
NICERegr$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
References
Brughmans, D., & Martens, D. (2021). NICE: An Algorithm for Nearest Instance Counterfactual Explanations. arXiv:2104.07411v2.
Gower, J. C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27(4), 857–871.
Examples
if (require("randomForest")) {
  set.seed(123456)
  # Train a model
  rf = randomForest(mpg ~ ., data = mtcars)
  # Create a predictor object
  predictor = iml::Predictor$new(rf)
  # Find counterfactuals
  nice_regr = NICERegr$new(predictor)
  cfactuals = nice_regr$find_counterfactuals(
    x_interest = mtcars[1L, ], desired_outcome = c(22, 26)
  )
  # Print the results
  cfactuals$data
  # Print archive
  nice_regr$archive
}