| IDEA {FACT} | R Documentation |
IDEA - Isolated Effect on Assignment
Description
IDEA with a soft label predictor (sIDEA)
tracks changes in the soft label of being assigned to each existing cluster
throughout a (multidimensional) feature space.
IDEA with a hard label predictor (hIDEA)
tracks changes in the hard label, i.e., the cluster assignment,
throughout a (multidimensional) feature space.
Details
IDEA for soft labeling algorithms (sIDEA) indicates the soft label of an
observation \textbf{x} with replaced values \tilde{\textbf{x}}_S being assigned to
the c-th cluster. IDEA for hard labeling algorithms (hIDEA) indicates
the cluster assignment of an observation \textbf{x} with replaced values
\tilde{\textbf{x}}_S.
The global IDEA is denoted by the corresponding data set X:
\text{sIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n} \sum_{i = 1}^n
\text{sIDEA}^{(1)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S), \dots, \frac{1}{n}
\sum_{i = 1}^n \text{sIDEA}^{(k)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S) \right)
where the c-th vector element is the average of the c-th vector elements of the local sIDEA functions. The global hIDEA corresponds to:
\text{hIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n}\sum_{i = 1}^n
\mathbb{1}_{1}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S)), \dots,
\frac{1}{n}\sum_{i = 1}^n \mathbb{1}_{k}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S))\right)
where the c-th vector element is the fraction of hard label reassignments to the c-th cluster.
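The two global definitions above reduce to column means and label shares at a single grid point. The following base R lines are a minimal sketch of that arithmetic only. The matrix `local_sidea` and the vector `local_hidea` are hypothetical values made up for illustration, not output of the package.

```r
# Hypothetical local sIDEA values at ONE grid point:
# rows = observations (n = 4), columns = clusters (k = 3).
local_sidea <- matrix(
  c(0.7, 0.2, 0.1,
    0.5, 0.4, 0.1,
    0.1, 0.3, 0.6,
    0.3, 0.3, 0.4),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("cluster1", "cluster2", "cluster3"))
)

# Global sIDEA: average the c-th element over all observations.
global_sidea <- colMeans(local_sidea)

# Hypothetical local hIDEA values (hard labels) at the same grid point.
local_hidea <- c(1L, 1L, 3L, 3L)

# Global hIDEA: fraction of observations assigned to each cluster c.
global_hidea <- vapply(1:3, function(c) mean(local_hidea == c), numeric(1))

print(global_sidea)
print(global_hidea)
```

Repeating this at every grid point would yield one global curve per cluster.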
Public fields
predictor (ClustPredictor)
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.
feature (character or list)
Features/feature sets to calculate the effect curves.
method (character(1))
The IDEA method to be used.
mg (DataGenerator)
A MarginalGenerator object to sample and generate the pseudo instances.
results (data.table)
The IDEA results.
noise.out (any)
Indicator for the noise variable.
Active bindings
type (function)
Detect the type in the predictor.
Methods
Public methods
Method new()
Create an IDEA object.
Usage
IDEA$new(predictor, feature, method = "g+l", grid.size = 20L, noise.out = NULL)
Arguments
predictor (ClustPredictor)
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.
feature (character or list)
For which features do you want importance scores calculated. The default value of NULL implies all features. Use a named list of character vectors to define groups of features for which joint importance will be calculated.
method (character(1))
The IDEA method to be used. Possible choices for the method are:
"g+l" (default): store global and local IDEA results.
"local": store only local IDEA results.
"global": store only global IDEA results.
"init_local": store only local IDEA results and an additional reference for each observation's initially assigned cluster.
"init_g+l": store global and local IDEA results and an additional reference for each observation's initially assigned cluster.
grid.size (numeric(1) or NULL)
Size of the grid of replacement values. If a grid size is given, an equidistant grid is created. If NULL, values are calculated at all present combinations of feature values.
noise.out (any)
Indicator for the noise variable. If not NULL, noise will be excluded from the effect estimation.
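The two grid.size behaviors described above can be pictured for a single numeric feature with the base R sketch below. The helper `make_grid` and the vector `x` are hypothetical names introduced for illustration. How the package builds its grid internally is not shown in this page, so treat the range-spanning `seq()` call as an assumption.

```r
# Hypothetical feature values; not taken from the package.
x <- c(2.0, 0.0, 1.0, 4.0, 1.0, 2.0)

# Assumption: a numeric grid.size gives an equidistant grid over the
# observed range; grid.size = NULL uses every distinct observed value.
make_grid <- function(values, grid.size = NULL) {
  if (is.null(grid.size)) {
    return(sort(unique(values)))
  }
  seq(min(values), max(values), length.out = grid.size)
}

grid_equi <- make_grid(x, grid.size = 5L)  # equidistant: 0, 1, 2, 3, 4
grid_all  <- make_grid(x)                  # observed values: 0, 1, 2, 4

print(grid_equi)
print(grid_all)
```

The equidistant grid includes the value 3, which never occurs in `x`, while the NULL variant only revisits values that are present in the data. For feature sets, the NULL case refers to present combinations of values, which this single-feature sketch does not cover.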
Returns
(data.frame)
Values for the effect curves:
One row per grid point per instance for each local IDEA
estimation. If the method includes global estimation, one
additional row per grid point.
Method plot()
Plot an IDEA object.
Usage
IDEA$plot(c = NULL)
Arguments
c
Indicator for the cluster to plot. If
NULL, all clusters are plotted.
Returns
(ggplot)
A ggplot object that depends on the method chosen.
Method plot_globals()
Plot the global sIDEA curves of all clusters.
Usage
IDEA$plot_globals(mass = NULL)
Arguments
mass
Between 0 and 1. The share of local
IDEA curves used to plot a certainty interval.
Returns
(ggplot)
A ggplot object.
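One plausible reading of the mass argument is a central band that contains the given share of local curves at each grid point. The base R sketch below illustrates that reading with quantiles. The helper `central_interval` and the values in `local_vals` are hypothetical, and the package may compute its interval differently.

```r
# Hypothetical local sIDEA values for one cluster at one grid point.
local_vals <- (0:100) / 100

# Assumption: the band is the central interval holding `mass` of the
# local values, i.e. equal tails of (1 - mass) / 2 are cut off.
central_interval <- function(values, mass) {
  stopifnot(mass >= 0, mass <= 1)
  tail_prob <- (1 - mass) / 2
  quantile(values, probs = c(tail_prob, 1 - tail_prob), names = FALSE)
}

band <- central_interval(local_vals, mass = 0.5)
print(band)  # lower and upper bound of the 50 percent band
```

Under this reading, a mass of 0.5 gives the interquartile range of the local values, and a mass close to 1 approaches their full range.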
Method clone()
The objects of this class are cloneable with this method.
Usage
IDEA$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
iml::FeatureEffects
Examples
# load data and packages
require(factoextra)
require(FuzzyDBScan)
multishapes = as.data.frame(multishapes[, 1:2])
# Set up and train FuzzyDBScan
eps = c(0, 0.2)
pts = c(3, 15)
res = FuzzyDBScan$new(multishapes, eps, pts)
res$plot("x", "y")
# create soft label predictor
predict_prob = function(model, newdata) model$predict(new_data = newdata)
predictor = ClustPredictor$new(res, as.data.frame(multishapes), y = res$results,
predict.function = predict_prob, type = "prob")
# Calculate `IDEA` global and local for feature "x"
idea_x = IDEA$new(predictor = predictor, feature = "x", grid.size = 5)
idea_x$plot_globals(0.5) # plot global effect of all clusters with 50 percent of local mass.