InterpretingMethod {innsight} | R Documentation |
Super class for interpreting methods
Description
This is a super class for all interpreting methods in the innsight package. The following methods are implemented:
- Deep Learning Important Features (DeepLift)
- Deep Shapley additive explanations (DeepSHAP)
- Layer-wise Relevance Propagation (LRP)
- Gradient-based methods:
  - Vanilla gradients including Gradient×Input (Gradient)
  - Smoothed gradients including SmoothGrad×Input (SmoothGrad)
  - Integrated gradients (IntegratedGradient)
  - Expected gradients (ExpectedGradient)
- Connection Weights (global and local) (ConnectionWeights)
- Also some model-agnostic approaches:
Public fields
data
(
list
)
The passed data as a list of torch_tensors in the selected data format (field dtype) matching the corresponding shapes of the individual input layers. In addition, the channel axis is moved to the second position after the batch size because internally only the "channels first" format is used.
converter
(
Converter
)
An instance of the Converter class that includes the torch-converted model and some other model-specific attributes. See Converter for details.
channels_first
(
logical(1)
)
The channel position of the given data. If TRUE, the channel axis is placed at the second position between the batch size and the rest of the input axes, e.g., c(10,3,32,32) for a batch of ten images with three channels and a height and width of 32 pixels. Otherwise (FALSE), the channel axis is at the last position, i.e., c(10,32,32,3). This is especially important for layers like flatten, where the order is crucial and therefore the channels have to be moved from the internal format "channels first" back to the original format before the layer is calculated.
dtype
(
character(1)
)
The data type for the calculations. Either 'float' for torch_float or 'double' for torch_double.
ignore_last_act
(
logical(1)
)
A logical value that determines whether the last activation functions are ignored in all calculations or not.
result
(
list
)
The results of the method on the passed data. A unified list structure is used regardless of the complexity of the model: The outer list contains the individual output layers and the inner list the input layers. The results for the respective output and input layer are then stored there as torch tensors in the given data format (field dtype). In addition, the channel axis is moved to its original place and the last axis contains the selected output nodes for the individual output layers (see output_idx).
For example, the structure of the result for two output layers (output node 1 for the first and 2 and 4 for the second) and two input layers with channels_first = FALSE looks like this:
    List of 2 # both output layers
     $ :List of 2 # both input layers
      ..$ : torch_tensor [batch_size, dim_in_1, channel_axis, 1]
      ..$ : torch_tensor [batch_size, dim_in_2, channel_axis, 1]
     $ :List of 2 # both input layers
      ..$ : torch_tensor [batch_size, dim_in_1, channel_axis, 2]
      ..$ : torch_tensor [batch_size, dim_in_2, channel_axis, 2]
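The nested layout described for the result field can be mimicked in base R. The following is only a sketch: plain arrays stand in for torch tensors, and all sizes (batch_size, dim_in_1, dim_in_2, channels) are made-up values for illustration.

```r
# Hypothetical sizes, chosen only for illustration.
batch_size <- 4
dim_in_1 <- 5
dim_in_2 <- 7
channels <- 3

# Outer list: output layers; inner list: input layers.
# Plain arrays stand in for torch tensors (channels_first = FALSE layout).
result <- list(
  list(  # first output layer, one selected output node
    array(0, dim = c(batch_size, dim_in_1, channels, 1)),
    array(0, dim = c(batch_size, dim_in_2, channels, 1))
  ),
  list(  # second output layer, two selected output nodes
    array(0, dim = c(batch_size, dim_in_1, channels, 2)),
    array(0, dim = c(batch_size, dim_in_2, channels, 2))
  )
)

# Result for the second output layer and the first input layer
shape <- dim(result[[2]][[1]])
print(shape)
```

Indexing therefore always follows the pattern result[[output_layer]][[input_layer]], even for models with a single input and output layer.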
output_idx
(
list
)
This list of indices specifies the output nodes to which the method is to be applied. In the order of the output layers, the list contains the respective output node indices, and unwanted output layers have the entry NULL instead of a vector of indices, e.g., list(NULL, c(1,3)) for the first and third output node in the second output layer.
output_label
(
list
)
This list of factors specifies the output nodes to which the method is to be applied. In the order of the output layers, the list contains the respective output node labels, and unwanted output layers have the entry NULL instead of a vector of labels, e.g., list(NULL, c("a", "c")) for the first and third output node in the second output layer.
verbose
(
logical(1)
)
This logical value determines whether a progress bar is displayed for the calculation of the method or not. The default value is the output of the primitive R function interactive().
winner_takes_all
(
logical(1)
)
This logical value is only relevant for models with a MaxPooling layer. Since many zeros are produced during the backward pass due to the selection of the maximum value in the pooling kernel, another variant is implemented, which treats a MaxPooling as an AveragePooling layer in the backward pass to overcome the problem of too many zero relevances. With the default value TRUE, the whole upper-layer relevance is passed to the maximum value in each pooling window. Otherwise, if FALSE, the relevance is distributed equally among all nodes in a pooling window.
preds
(
list
)
In this field, all calculated predictions are stored as a list of torch_tensors. Each output layer has its own list entry and contains the respective predicted values.
decomp_goal
(
list
)
In this field, the method-specific decomposition objectives are stored as a list of torch_tensors for each output layer. For example, Gradient×Input and LRP attempt to decompose the prediction into feature-wise additive effects. DeepLift and IntegratedGradient decompose the difference between f(x) and f(x'). On the other hand, DeepSHAP and ExpectedGradient aim to decompose f(x) minus the averaged prediction across the reference values.
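The three decomposition goals can be checked by hand on a toy linear model, where the attributions have a closed form. This is a sketch in base R, not innsight's implementation: the weights w, the data point x and the reference values are invented, and the closed forms used here hold only because the model is linear and has no bias term.

```r
# Toy linear model f(x) = sum(w * x) without bias; all values are made up.
w <- c(0.5, -1.0, 2.0)
f <- function(x) sum(w * x)

x     <- c(1, 2, 3)                       # data point to explain
x_ref <- c(0, 1, 1)                       # single reference value x'
refs  <- rbind(c(0, 1, 1), c(1, 0, 0))    # several reference values

# Gradient x Input: the gradient of f is w, so the relevances are w * x
# and they add up to the prediction f(x).
rel_gxi  <- w * x
goal_gxi <- f(x)

# For a linear model, integrated gradients reduce to w * (x - x'),
# which adds up to f(x) - f(x').
rel_ig  <- w * (x - x_ref)
goal_ig <- f(x) - f(x_ref)

# Averaging over several references yields f(x) minus the mean prediction
# on the reference values.
rel_eg  <- w * (x - colMeans(refs))
goal_eg <- f(x) - mean(apply(refs, 1, f))

print(c(sum(rel_gxi), goal_gxi))
print(c(sum(rel_ig), goal_ig))
print(c(sum(rel_eg), goal_eg))
```

In each pair, the sum of the relevances matches the corresponding decomposition goal, which is the quantity the field decomp_goal is described as storing.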
Methods
Public methods
Method new()
Create a new instance of this super class.
Usage
InterpretingMethod$new(
  converter,
  data,
  channels_first = TRUE,
  output_idx = NULL,
  output_label = NULL,
  ignore_last_act = TRUE,
  winner_takes_all = TRUE,
  verbose = interactive(),
  dtype = "float"
)
Arguments
converter
(
Converter
)
An instance of the Converter class that includes the torch-converted model and some other model-specific attributes. See Converter for details.
data
(
array
,data.frame
,torch_tensor
orlist
)
The data to which the method is to be applied. These must have the same format as the input data of the model passed to the converter object. This means either
- an array, data.frame, torch_tensor or array-like format of size (batch_size, dim_in), if e.g., the model has only one input layer, or
- a list with the corresponding input data (according to the point above) for each of the input layers.
channels_first
(
logical(1)
)
The channel position of the given data (argument data). If TRUE, the channel axis is placed at the second position between the batch size and the rest of the input axes, e.g., c(10,3,32,32) for a batch of ten images with three channels and a height and width of 32 pixels. Otherwise (FALSE), the channel axis is at the last position, i.e., c(10,32,32,3). If the data has no channel axis, use the default value TRUE.
output_idx
(
integer
,list
orNULL
)
These indices specify the output nodes for which the method is to be applied. In order to allow models with multiple output layers, there are the following possibilities to select the indices of the output nodes in the individual output layers:
- An integer vector of indices: If the model has only one output layer, the values correspond to the indices of the output nodes, e.g., c(1,3,4) for the first, third and fourth output node. If there are multiple output layers, the indices of the output nodes from the first output layer are considered.
- A list of integer vectors of indices: If the method is to be applied to output nodes from different layers, a list can be passed that specifies the desired indices of the output nodes for each output layer. Unwanted output layers have the entry NULL instead of a vector of indices, e.g., list(NULL, c(1,3)) for the first and third output node in the second output layer.
- NULL (default): The method is applied to all output nodes in the first output layer but is limited to the first ten as the calculations become more computationally expensive for more output nodes.
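The three accepted forms of output_idx can all be brought into the list-per-output-layer shape. The sketch below uses a hypothetical helper, normalize_output_idx(), that is not part of innsight and only mirrors the rules stated above; n_nodes (the number of output nodes per output layer) is likewise an assumed input.

```r
# Hypothetical helper, not part of innsight: convert the three accepted
# forms of output_idx into one list with an entry per output layer.
normalize_output_idx <- function(output_idx, n_nodes) {
  n_layers <- length(n_nodes)
  out <- vector("list", n_layers)  # every entry starts as NULL
  if (is.null(output_idx)) {
    # Default: first output layer, at most the first ten nodes
    out[[1]] <- seq_len(min(n_nodes[1], 10))
  } else if (is.list(output_idx)) {
    # One entry per output layer; NULL entries stay NULL
    for (i in seq_along(output_idx)) {
      if (!is.null(output_idx[[i]])) out[[i]] <- as.integer(output_idx[[i]])
    }
  } else {
    # A plain vector always refers to the first output layer
    out[[1]] <- as.integer(output_idx)
  }
  out
}

n_nodes <- c(12, 5)  # assumed: two output layers with 12 and 5 nodes
res_null <- normalize_output_idx(NULL, n_nodes)
res_vec  <- normalize_output_idx(c(1, 3, 4), n_nodes)
res_list <- normalize_output_idx(list(NULL, c(1, 3)), n_nodes)
str(res_list)
```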
output_label
(
character
,factor
,list
orNULL
)
These values specify the output nodes for which the method is to be applied. Only values that were previously passed with the argument output_names in the converter can be used. In order to allow models with multiple output layers, there are the following possibilities to select the names of the output nodes in the individual output layers:
- A character vector or factor of labels: If the model has only one output layer, the values correspond to the labels of the output nodes named in the passed Converter object, e.g., c("a", "c", "d") for the first, third and fourth output node if the output names are c("a", "b", "c", "d"). If there are multiple output layers, the names of the output nodes from the first output layer are considered.
- A list of character/factor vectors of labels: If the method is to be applied to output nodes from different layers, a list can be passed that specifies the desired labels of the output nodes for each output layer. Unwanted output layers have the entry NULL instead of a vector of labels, e.g., list(NULL, c("a", "c")) for the first and third output node in the second output layer.
- NULL (default): The method is applied to all output nodes in the first output layer but is limited to the first ten as the calculations become more computationally expensive for more output nodes.
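The relationship between output_label and output_idx amounts to a lookup of labels in the output names. A minimal base R sketch of that lookup follows; the output names of the first layer ("x", "y") and the variable names are invented, and the code illustrates the idea rather than how innsight resolves labels internally.

```r
# Single output layer: labels are looked up in the output names.
output_names <- c("a", "b", "c", "d")   # names as passed to the converter
output_label <- c("a", "c", "d")
idx <- match(output_label, output_names)
if (anyNA(idx)) stop("unknown output label")
print(idx)

# Several output layers: one entry per layer, NULL for unwanted layers.
names_per_layer  <- list(c("x", "y"), c("a", "b", "c", "d"))  # made up
labels_per_layer <- list(NULL, c("a", "c"))
idx_per_layer <- Map(
  function(lbl, nms) if (is.null(lbl)) NULL else match(lbl, nms),
  labels_per_layer, names_per_layer
)
str(idx_per_layer)
```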
ignore_last_act
(
logical(1)
)
Set this logical value to ignore the last activation functions for each output layer (default: TRUE) or to include them (FALSE). In practice, the last activation (especially for softmax activation) is often omitted.
winner_takes_all
(
logical(1)
)
This logical argument is only relevant for models with a MaxPooling layer. Since many zeros are produced during the backward pass due to the selection of the maximum value in the pooling kernel, another variant is implemented, which treats a MaxPooling as an AveragePooling layer in the backward pass to overcome the problem of too many zero relevances. With the default value TRUE, the whole upper-layer relevance is passed to the maximum value in each pooling window. Otherwise, if FALSE, the relevance is distributed equally among all nodes in a pooling window.
verbose
(
logical(1)
)
This logical argument determines whether a progress bar is displayed for the calculation of the method or not. The default value is the output of the primitive R function interactive().
dtype
(
character(1)
)
The data type for the calculations. Use either 'float' for torch_float or 'double' for torch_double.
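The two layouts described for the argument channels_first differ only in the order of the axes, so one can be converted into the other with base R's aperm(). The sketch below uses a dummy batch of ten 32x32 images with three channels; the variable names are illustrative only.

```r
# Dummy batch in "channels last" layout: (batch, height, width, channels)
x_last <- array(seq_len(10 * 32 * 32 * 3), dim = c(10, 32, 32, 3))

# Move the channel axis to the second position: (batch, channels, height, width)
x_first <- aperm(x_last, c(1, 4, 2, 3))
print(dim(x_first))

# Move the channel axis back to the last position
x_back <- aperm(x_first, c(1, 3, 4, 2))
print(dim(x_back))
```

Data in the first layout corresponds to channels_first = FALSE and data in the second to channels_first = TRUE.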
Method get_result()
This function returns the result of this method for the given data either as an array ('array'), a torch tensor ('torch.tensor' or 'torch_tensor') of size (batch_size, dim_in, dim_out) or as a data.frame ('data.frame'). This method is also implemented as a generic S3 function get_result. For a detailed description, we refer to our in-depth vignette (vignette("detailed_overview", package = "innsight")) or our website.
Usage
InterpretingMethod$get_result(type = "array")
Arguments
type
(
character(1)
)
The data type of the result. Use one of 'array', 'torch.tensor', 'torch_tensor' or 'data.frame' (default: 'array').
Returns
The result of this method for the given data in the chosen type.
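To give a feeling for the difference between the array and the data.frame form, the following base R sketch flattens a small array of size (batch_size, dim_in, dim_out) into a long table with one row per value. The dimension names and column names (data, feature, output_node, value) are assumptions made for this example and are not necessarily the columns that get_result() produces.

```r
# Made-up result of size (batch_size = 2, dim_in = 3, dim_out = 2)
res <- array(
  c(0.1, -0.2, 0.3, 0.4, 0.5, -0.6, 0.7, 0.8, -0.9, 1.0, 1.1, 1.2),
  dim = c(2, 3, 2),
  dimnames = list(
    data = c("data_1", "data_2"),
    feature = c("X1", "X2", "X3"),
    output_node = c("Y1", "Y2")
  )
)

# Long format: one row per (data point, feature, output node) combination
df <- as.data.frame(as.table(res), responseName = "value")
print(head(df))
```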
Method plot()
This method visualizes the result of the selected method and enables a visual in-depth investigation with the help of the S4 classes innsight_ggplot2 and innsight_plotly.
You can use the argument data_idx to select the data points in the given data for the plot. In addition, the individual output nodes for the plot can be selected with the argument output_idx. The different results for the selected data points and outputs are visualized using the ggplot2-based S4 class innsight_ggplot2. You can also use the as_plotly argument to generate an interactive plot with innsight_plotly based on the plot function plotly::plot_ly. For more information and the full range of options, see innsight_ggplot2 and innsight_plotly.
Notes:
- For the interactive plotly-based plots, the suggested package plotly is required.
- The ggplot2-based plots for models with multiple input layers are a bit more complex; therefore, the suggested packages 'grid', 'gridExtra' and 'gtable' must be installed in your R session.
- If the global Connection Weights method was applied, the unnecessary argument data_idx will be ignored.
- The predictions, the sum of relevances, and, if available, the decomposition target are displayed by default in a box within the plot. Currently, these are not generated for plotly plots.
Usage
InterpretingMethod$plot(
  data_idx = 1,
  output_idx = NULL,
  output_label = NULL,
  aggr_channels = "sum",
  as_plotly = FALSE,
  same_scale = FALSE,
  show_preds = TRUE
)
Arguments
data_idx
(
integer
)
An integer vector containing the numbers of the data points whose result is to be plotted, e.g., c(1,3) for the first and third data point in the given data. Default: 1. This argument will be ignored for the global Connection Weights method.
output_idx
(
integer
,list
orNULL
)
The indices of the output nodes for which the results are to be plotted. This can be either an integer vector of indices or a list of integer vectors of indices but must be a subset of the indices for which the results were calculated, i.e., a subset of output_idx from the initialization new() (see argument output_idx in method new() of this R6 class for details). By default (NULL), the smallest index of all calculated output nodes and output layers is used.
output_label
(
character
,factor
,list
orNULL
)
These values specify the output nodes for which the method is to be applied. Only values that were previously passed with the argument output_names in the converter can be used. In order to allow models with multiple output layers, there are the following possibilities to select the names of the output nodes in the individual output layers:
- A character vector or factor of labels: If the model has only one output layer, the values correspond to the labels of the output nodes named in the passed Converter object, e.g., c("a", "c", "d") for the first, third and fourth output node if the output names are c("a", "b", "c", "d"). If there are multiple output layers, the names of the output nodes from the first output layer are considered.
- A list of character/factor vectors of labels: If the method is to be applied to output nodes from different layers, a list can be passed that specifies the desired labels of the output nodes for each output layer. Unwanted output layers have the entry NULL instead of a vector of labels, e.g., list(NULL, c("a", "c")) for the first and third output node in the second output layer.
- NULL (default): The method is applied to all output nodes in the first output layer but is limited to the first ten as the calculations become more computationally expensive for more output nodes.
aggr_channels
(
character(1)
orfunction
)
Pass one of 'norm', 'sum', 'mean' or a custom function to aggregate the channels, e.g., the maximum (base::max) or minimum (base::min) over the channels or only individual channels with function(x) x[1]. By default ('sum'), the sum of all channels is used.
Note: This argument is used only for 2D and 3D input data.
as_plotly
(
logical(1)
)
This logical value (default: FALSE) can be used to create an interactive plot based on the library plotly (see innsight_plotly for details).
Note: Make sure that the suggested package plotly is installed in your R session.
same_scale
(
logical
)
A logical value that specifies whether the individual plots have the same fill scale across multiple input layers or whether each is scaled individually. This argument is only used if the results of more than one input layer are plotted.
show_preds
(
logical
)
This logical value indicates whether the plots display the prediction, the sum of calculated relevances, and, if available, the targeted decomposition value. For example, in the case of Gradient×Input, the goal is to obtain a decomposition of the predicted value, while for DeepLift and IntegratedGradient, the goal is the difference between the prediction and the reference value, i.e., f(x) - f(x').
Returns
Returns either an innsight_ggplot2 (as_plotly = FALSE) or an innsight_plotly (as_plotly = TRUE) object with the plotted individual results.
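The channel aggregation controlled by the argument aggr_channels of plot() can be pictured as applying a function over the channel axis of each result. The base R sketch below uses a made-up relevance array for one data point in the layout (channels, height, width) and a hypothetical helper aggregate_channels(). The formula used for 'norm' (Euclidean norm) is an assumption, since the text above does not define it.

```r
# Made-up relevances for one data point: (channels = 3, height = 2, width = 2)
rel <- array(c(1, -2, 3, 4, 5, -6, 7, 8, 9, 10, -11, 12), dim = c(3, 2, 2))

# Hypothetical helper: aggregate over the channel axis (first axis here)
aggregate_channels <- function(rel, aggr) apply(rel, c(2, 3), aggr)

agg_sum   <- aggregate_channels(rel, sum)                 # like 'sum'
agg_mean  <- aggregate_channels(rel, mean)                # like 'mean'
agg_max   <- aggregate_channels(rel, max)                 # custom: base::max
agg_first <- aggregate_channels(rel, function(x) x[1])    # first channel only
# Assumption: 'norm' is taken to be the Euclidean norm over the channels
agg_norm  <- aggregate_channels(rel, function(x) sqrt(sum(x^2)))

print(agg_sum)
print(agg_first)
```

Note how 'sum' lets positive and negative channel values cancel, whereas a norm-type aggregation does not.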
Method plot_global()
This method visualizes the results of the selected method summarized as boxplots/median image and enables a visual in-depth investigation of the global behavior with the help of the S4 classes innsight_ggplot2 and innsight_plotly.
You can use the argument output_idx to select the individual output nodes for the plot. For tabular and 1D data, boxplots are created in which a reference value can be selected from the data using the ref_data_idx argument. For images, only the pixel-wise median is visualized due to the complexity. The plot is generated using the ggplot2-based S4 class innsight_ggplot2. You can also use the as_plotly argument to generate an interactive plot with innsight_plotly based on the plot function plotly::plot_ly. For more information and the full range of options, see innsight_ggplot2 and innsight_plotly.
Notes:
- This method can only be used for the local Connection Weights method, i.e., if times_input is TRUE and data is provided.
- For the interactive plotly-based plots, the suggested package plotly is required.
- The ggplot2-based plots for models with multiple input layers are a bit more complex; therefore, the suggested packages 'grid', 'gridExtra' and 'gtable' must be installed in your R session.
Usage
InterpretingMethod$plot_global(
  output_idx = NULL,
  output_label = NULL,
  data_idx = "all",
  ref_data_idx = NULL,
  aggr_channels = "sum",
  preprocess_FUN = abs,
  as_plotly = FALSE,
  individual_data_idx = NULL,
  individual_max = 20
)
Arguments
output_idx
(
integer
,list
orNULL
)
The indices of the output nodes for which the results are to be plotted. This can be either a vector of indices or a list of vectors of indices but must be a subset of the indices for which the results were calculated, i.e., a subset of output_idx from the initialization new() (see argument output_idx in method new() of this R6 class for details). By default (NULL), the smallest index of all calculated output nodes and output layers is used.
output_label
(
character
,factor
,list
orNULL
)
These values specify the output nodes for which the method is to be applied. Only values that were previously passed with the argument output_names in the converter can be used. In order to allow models with multiple output layers, there are the following possibilities to select the names of the output nodes in the individual output layers:
- A character vector or factor of labels: If the model has only one output layer, the values correspond to the labels of the output nodes named in the passed Converter object, e.g., c("a", "c", "d") for the first, third and fourth output node if the output names are c("a", "b", "c", "d"). If there are multiple output layers, the names of the output nodes from the first output layer are considered.
- A list of character/factor vectors of labels: If the method is to be applied to output nodes from different layers, a list can be passed that specifies the desired labels of the output nodes for each output layer. Unwanted output layers have the entry NULL instead of a vector of labels, e.g., list(NULL, c("a", "c")) for the first and third output node in the second output layer.
- NULL (default): The method is applied to all output nodes in the first output layer but is limited to the first ten as the calculations become more computationally expensive for more output nodes.
data_idx
(
integer
)
By default, all available data points are used to calculate the boxplot information. However, this parameter can be used to select a subset of them by passing the indices. For example, with c(1:10, 25, 26) only the first 10 data points and the 25th and 26th are used to calculate the boxplots.
ref_data_idx
(
integer(1)
orNULL
)
This integer number determines the index for the reference data point. In addition to the boxplots, it is displayed in red color and is used to compare an individual result with the summary statistics provided by the boxplot. With the default value (NULL), no individual data point is plotted. This index can be chosen with respect to all available data, even if only a subset is selected with argument data_idx.
Note: Because of the complexity of 2D inputs, this argument is used only for tabular and 1D inputs and disregarded for 2D inputs.
aggr_channels
(
character(1)
orfunction
)
Pass one of 'norm', 'sum', 'mean' or a custom function to aggregate the channels, e.g., the maximum (base::max) or minimum (base::min) over the channels or only individual channels with function(x) x[1]. By default ('sum'), the sum of all channels is used.
Note: This argument is used only for 2D and 3D input data.
preprocess_FUN
(
function
)
This function is applied to the method's result before calculating the boxplots or medians. Since positive and negative values often cancel each other out, the absolute value (abs) is used by default. You can also use the raw results (identity) to see the sign of the results, the squared data (function(x) x^2) to give outliers a higher weight, or any other function.
as_plotly
(
logical(1)
)
This logical value (default: FALSE) can be used to create an interactive plot based on the library plotly (see innsight_plotly for details).
Note: Make sure that the suggested package plotly is installed in your R session.
individual_data_idx
(
integer
orNULL
)
Only relevant for a plotly plot with tabular or 1D inputs! This integer vector of data indices determines the available data points in a dropdown menu, which are drawn individually in the same way as ref_data_idx, but for several data points. With the default value NULL, the first individual_max data points are used.
Note: If ref_data_idx is specified, this data point will be added to those from individual_data_idx in the dropdown menu.
individual_max
(
integer(1)
)
Only relevant for aplotly
plot with tabular or 1D inputs! This integer determines the maximum number of individual data points in the dropdown menu without countingref_data_idx
. This means that ifindividual_data_idx
has more thanindividual_max
indices, only the firstindividual_max
will be used. A too high number can significantly increase the runtime.
Returns
Returns either an innsight_ggplot2 (as_plotly = FALSE) or an innsight_plotly (as_plotly = TRUE) object with the plotted summarized results.
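The roles of data_idx, ref_data_idx, preprocess_FUN and individual_max in plot_global() can be pictured with a small base R computation on tabular results. This is a sketch under stated assumptions only: the result matrix res is random dummy data, the five-number summary from boxplot.stats() stands in for whatever statistics the actual plot draws, and none of this is innsight's internal code.

```r
set.seed(1)
# Dummy results for tabular data: 30 data points x 4 features
res <- matrix(rnorm(30 * 4), nrow = 30, ncol = 4,
              dimnames = list(NULL, paste0("X", 1:4)))

data_idx       <- c(1:10, 25, 26)  # subset used for the boxplots
ref_data_idx   <- 3                # reference point, chosen from all data
preprocess_FUN <- abs              # default preprocessing

# Preprocess first, then summarize each feature over the selected points
processed <- preprocess_FUN(res[data_idx, , drop = FALSE])
box_stats <- apply(processed, 2, function(v) boxplot.stats(v)$stats)
ref_values <- preprocess_FUN(res[ref_data_idx, ])

# Dropdown candidates: with NULL, fall back to the first individual_max points
individual_max <- 5
individual_data_idx <- NULL
if (is.null(individual_data_idx)) individual_data_idx <- seq_len(nrow(res))
individual_data_idx <- head(individual_data_idx, individual_max)

print(box_stats)
print(individual_data_idx)
```

Replacing abs with identity in this sketch keeps the signs of the values, so positive and negative results can offset each other in the summary.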
Method print()
Print a summary of the method object. This summary contains the individual fields and in particular the results of the applied method.
Usage
InterpretingMethod$print()
Returns
Returns the method object invisibly via base::invisible.
Method clone()
The objects of this class are cloneable with this method.
Usage
InterpretingMethod$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.