IDEA {FACT}    R Documentation

IDEA - Isolated Effect on Assignment

Description

IDEA with a soft label predictor (sIDEA)
tracks changes in the soft label of being assigned to each existing cluster throughout a (multidimensional) feature space.
IDEA with a hard label predictor (hIDEA)
tracks changes in the hard cluster assignment throughout a (multidimensional) feature space.

Details

IDEA for soft labeling algorithms (sIDEA) indicates the soft label with which an observation \textbf{x} with replaced values \tilde{\textbf{x}}_S is assigned to the c-th cluster. IDEA for hard labeling algorithms (hIDEA) indicates the cluster assignment of an observation \textbf{x} with replaced values \tilde{\textbf{x}}_S.

The global IDEA is denoted by the corresponding data set X:

\text{sIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n} \sum_{i = 1}^n \text{sIDEA}^{(1)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S), \dots, \frac{1}{n} \sum_{i = 1}^n \text{sIDEA}^{(k)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S) \right)

where the c-th vector element is the average of the c-th vector elements of the local sIDEA functions. The global hIDEA corresponds to:

\text{hIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n}\sum_{i = 1}^n \mathbb{1}_{1}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S)), \dots, \frac{1}{n}\sum_{i = 1}^n \mathbb{1}_{k}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S))\right)

where the c-th vector element is the fraction of hard label reassignments to the c-th cluster.
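
To make the two aggregations concrete, here is a minimal base R sketch. It is not the package's implementation. It assumes the local sIDEA values at one fixed \tilde{\textbf{x}}_S are already available as an n x k matrix local_soft (one row per observation, one column per cluster) and the local hIDEA values as an integer vector local_hard; both names and all numbers are hypothetical.

```r
# Hypothetical local results for n = 4 observations and k = 3 clusters,
# all evaluated at the same replaced values x~_S.
local_soft <- matrix(
  c(0.7, 0.2, 0.1,
    0.5, 0.4, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.2, 0.5),
  nrow = 4, byrow = TRUE
)
local_hard <- c(1L, 1L, 2L, 3L)
k <- ncol(local_soft)

# Global sIDEA: the c-th element is the mean of the c-th local soft labels.
global_sidea <- colMeans(local_soft)

# Global hIDEA: the c-th element is the fraction of observations
# whose hard label at x~_S equals cluster cl.
global_hidea <- vapply(
  seq_len(k),
  function(cl) mean(local_hard == cl),
  numeric(1)
)

print(global_sidea)
print(global_hidea)
```

In this toy example both vectors sum to 1, because every row of local_soft sums to 1 and every observation receives exactly one hard label.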

Public fields

predictor

ClustPredictor
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.

feature

(character or list)
Features or feature sets for which to calculate the effect curves.

method

character(1)
The IDEA method to be used.

mg

DataGenerator
A MarginalGenerator object to sample and generate the pseudo instances.

results

data.table
The IDEA results.

noise.out

any
Indicator for the noise variable.

Active bindings

type

function
Detect the type in the predictor

Methods

Public methods


Method new()

Create an IDEA object.

Usage
IDEA$new(predictor, feature, method = "g+l", grid.size = 20L, noise.out = NULL)
Arguments
predictor

ClustPredictor
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.

feature

(character or list)
The features for which to calculate the effect curves. The default value of NULL implies all features. Use a named list of character vectors to define groups of features for which a joint effect will be calculated.

method

character(1)
The IDEA method to be used. Possible choices for the method are:
"g+l" (default): store global and local IDEA results

"local": store only local IDEA results

"global": store only global IDEA results

"init_local": store only local IDEA results and an additional reference to each observation's initially assigned cluster.

"init_g+l": store global and local IDEA results and an additional reference to each observation's initially assigned cluster.

grid.size

⁠(numeric(1) or NULL)⁠
Size of the grid used to replace values. If a grid size is given, an equidistant grid is created. If NULL, values are calculated at all present combinations of feature values.

noise.out

any
Indicator for the noise variable. If not NULL, noise will be excluded from the effect estimation.

Returns

(data.frame)
Values for the effect curves:
One row per grid point per instance for each local IDEA estimate. If the method includes global estimation, one additional row per grid point.
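
The role of grid.size and the row layout of the result can be pictured with a short base R sketch. It is an illustration under assumptions, not the package's code: it assumes the equidistant grid spans the observed range of the feature, and the names dat, grid, grid_all, and pseudo are hypothetical.

```r
# Toy data with two features; "x" is the feature of interest.
dat <- data.frame(x = c(0.1, 0.4, 0.9), y = c(1, 2, 3))
grid.size <- 5L

# Assumed grid construction: equidistant points over the observed range of x.
grid <- seq(min(dat$x), max(dat$x), length.out = grid.size)

# With grid.size = NULL, all present feature values are used; for a single
# feature this would correspond to the unique observed values.
grid_all <- sort(unique(dat$x))

# Pseudo instances: every observation is paired with every grid value,
# which replaces its original x while y stays untouched.
idx <- rep(seq_len(nrow(dat)), each = length(grid))
pseudo <- dat[idx, , drop = FALSE]
pseudo$x <- rep(grid, times = nrow(dat))
pseudo$.id <- idx  # which observation each row came from

print(nrow(pseudo))  # one row per grid point per instance
```

Predicting cluster labels on pseudo and grouping by .id would then yield one local curve per observation; averaging per grid value would give the global curve.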


Method plot()

Plot an IDEA object.

Usage
IDEA$plot(c = NULL)
Arguments
c

Indicator for the cluster to plot. If NULL, all clusters are plotted.

Returns

(ggplot)
A ggplot object that depends on the method chosen.


Method plot_globals()

Plot the global sIDEA curves of all clusters.

Usage
IDEA$plot_globals(mass = NULL)
Arguments
mass

A value between 0 and 1. The share of local IDEA curves used to plot a certainty interval.

Returns

(ggplot)
A ggplot object.
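
How mass translates into a band is not spelled out here. One plausible reading, sketched below in base R as an assumption rather than the package's actual computation, is a central interval that encloses the given share of local curves at each grid point. The matrix local_curves and the helper central_band() are hypothetical.

```r
# Hypothetical local sIDEA curves for one cluster:
# rows are observations, columns are grid points.
local_curves <- rbind(
  c(0.10, 0.20, 0.30),
  c(0.20, 0.40, 0.50),
  c(0.30, 0.50, 0.70),
  c(0.40, 0.60, 0.80),
  c(0.50, 0.90, 0.90)
)

# Assumed helper: lower and upper quantiles enclosing the central `mass`.
central_band <- function(curves, mass) {
  stopifnot(mass >= 0, mass <= 1)
  probs <- c((1 - mass) / 2, 1 - (1 - mass) / 2)
  band <- apply(curves, 2, quantile, probs = probs, names = FALSE)
  rownames(band) <- c("lower", "upper")
  band
}

band <- central_band(local_curves, mass = 0.5)
global_curve <- colMeans(local_curves)  # the curve the band would surround
print(band)
```

Under this reading, mass = 0.5 corresponds to the band between the 25 and 75 percent quantiles of the local curves at each grid point.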


Method clone()

The objects of this class are cloneable with this method.

Usage
IDEA$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

iml::FeatureEffects

Examples


# load data and packages
library(FACT)
library(factoextra)
library(FuzzyDBScan)
multishapes = as.data.frame(multishapes[, 1:2])
# set up and train FuzzyDBScan
eps = c(0, 0.2)
pts = c(3, 15)
res = FuzzyDBScan$new(multishapes, eps, pts)
res$plot("x", "y")
# create soft label predictor
predict_prob = function(model, newdata) model$predict(new_data = newdata)
predictor = ClustPredictor$new(res, as.data.frame(multishapes), y = res$results,
                               predict.function = predict_prob, type = "prob")
# calculate global and local `IDEA` for feature "x"
idea_x = IDEA$new(predictor = predictor, feature = "x", grid.size = 5)
# plot the global effect of all clusters with 50 percent of the local mass
idea_x$plot_globals(0.5)


[Package FACT version 0.1.1 Index]