IDEA {FACT} | R Documentation |
IDEA - Isolated Effect on Assignment
Description
IDEA with a soft label predictor (sIDEA) tracks changes in the soft label of being assigned to each existing cluster throughout a (multidimensional) feature space.
IDEA with a hard label predictor (hIDEA) tracks changes in the hard label, i.e., the cluster assignment, throughout a (multidimensional) feature space.
Details
IDEA for soft labeling algorithms (sIDEA) indicates the soft label with which an observation \textbf{x} with replaced values \tilde{\textbf{x}}_S is assigned to the c-th cluster.
IDEA for hard labeling algorithms (hIDEA) indicates the cluster assignment of an observation \textbf{x} with replaced values \tilde{\textbf{x}}_S.
The global IDEA is denoted by using the corresponding data set X as the subscript:
\text{sIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n} \sum_{i = 1}^n
\text{sIDEA}^{(1)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S), \dots, \frac{1}{n}
\sum_{i = 1}^n \text{sIDEA}^{(k)}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S) \right)
where the c-th vector element is the average of the c-th vector elements of the local sIDEA functions. The global hIDEA corresponds to:
\text{hIDEA}_X(\tilde{\textbf{x}}_S) = \left(\frac{1}{n}\sum_{i = 1}^n
\mathbb{1}_{1}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S)), \dots,
\frac{1}{n}\sum_{i = 1}^n \mathbb{1}_{k}(\text{hIDEA}_{\textbf{x}^{(i)}}(\tilde{\textbf{x}}_S))\right)
where the c-th vector element is the fraction of hard label reassignments to the c-th cluster.
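The two aggregation rules above can be illustrated with a minimal base R sketch. It does not use the package itself: the matrix `local_sidea`, the vector `local_hidea`, and all values in them are made up for illustration and stand for the local results of n = 4 observations at a single grid point with k = 3 clusters.

```r
# Hypothetical local sIDEA values at one grid point: one row per observation,
# one column per cluster (soft labels, each row sums to 1).
local_sidea <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4,
    0.5, 0.5, 0.0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, paste0("cluster_", 1:3))
)

# Global sIDEA: average each cluster's soft label over the n observations.
global_sidea <- colMeans(local_sidea)

# Hypothetical local hIDEA values: the hard label assigned to each observation.
local_hidea <- c(1L, 2L, 3L, 1L)
k <- 3L

# Global hIDEA: fraction of observations assigned to each of the k clusters.
global_hidea <- tabulate(local_hidea, nbins = k) / length(local_hidea)
```

Both results are vectors of length k whose elements sum to 1, so they can be read as the average assignment to each cluster at that grid point.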
Public fields
predictor
ClustPredictor
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.
feature
(character or list)
Features/feature sets for which to calculate the effect curves.
method
character(1)
The IDEA method to be used.
mg
DataGenerator
A MarginalGenerator object to sample and generate the pseudo instances.
results
data.table
The IDEA results.
noise.out
any
Indicator for the noise variable.
Active bindings
type
function
Detect the type in the predictor
Methods
Public methods
Method new()
Create an IDEA object.
Usage
IDEA$new(predictor, feature, method = "g+l", grid.size = 20L, noise.out = NULL)
Arguments
predictor
ClustPredictor
The object (created with ClustPredictor$new()) holding the cluster algorithm and the data.
feature
(character or list)
For which features do you want importance scores calculated. The default value of NULL implies all features. Use a named list of character vectors to define groups of features for which joint importance will be calculated.
method
character(1)
The IDEA method to be used. Possible choices for the method are:
"g+l" (default): store global and local IDEA results.
"local": store only local IDEA results.
"global": store only global IDEA results.
"init_local": store only local IDEA results and an additional reference for each observation's initially assigned cluster.
"init_g+l": store global and local IDEA results and an additional reference for each observation's initially assigned cluster.
grid.size
(numeric(1) or NULL)
Size of the grid used to replace values. If a grid size is given, an equidistant grid is created. If NULL, values are calculated at all present combinations of feature values.
noise.out
any
Indicator for the noise variable. If not NULL, noise will be excluded from the effect estimation.
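As a rough illustration of the grid.size argument, the following base R sketch builds replacement values for a single numeric feature. The helper `make_grid` is hypothetical and not part of the package; it assumes that the equidistant grid spans the observed range of the feature, which the documentation does not state explicitly, and it ignores feature sets with more than one feature.

```r
# Hypothetical helper, not part of the package: build the replacement values
# for a single numeric feature.
make_grid <- function(x, grid.size = 20L) {
  if (is.null(grid.size)) {
    # No grid size given: use the distinct values present in the data.
    return(sort(unique(x)))
  }
  # Assumption: equidistant grid spanning the observed range of the feature.
  seq(min(x), max(x), length.out = grid.size)
}

x <- c(0.2, 1.5, 1.5, 3.0, 4.2)
grid_equidistant <- make_grid(x, grid.size = 5L)
grid_observed <- make_grid(x, grid.size = NULL)
```

A smaller grid.size reduces the number of pseudo instances that have to be predicted, at the cost of a coarser effect curve.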
Returns
(data.frame)
Values for the effect curves:
One row per grid point per instance for each local IDEA estimation. If method includes global estimation, one additional row per grid point.
Method plot()
Plot an IDEA object.
Usage
IDEA$plot(c = NULL)
Arguments
c
Indicator for the cluster to plot. If NULL, all clusters are plotted.
Returns
(ggplot)
A ggplot object that depends on the method chosen.
Method plot_globals()
Plot the global sIDEA curves of all clusters.
Usage
IDEA$plot_globals(mass = NULL)
Arguments
mass
Value between 0 and 1. The fraction of local IDEA curves used to plot a certainty interval.
Returns
(ggplot)
A ggplot object.
Method clone()
The objects of this class are cloneable with this method.
Usage
IDEA$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
iml::FeatureEffects
Examples
# load data and packages
require(factoextra)
require(FuzzyDBScan)
multishapes = as.data.frame(multishapes[, 1:2])
# Set up and train FuzzyDBScan
eps = c(0, 0.2)
pts = c(3, 15)
res = FuzzyDBScan$new(multishapes, eps, pts)
res$plot("x", "y")
# create soft label predictor
predict_prob = function(model, newdata) model$predict(new_data = newdata)
predictor = ClustPredictor$new(res, as.data.frame(multishapes), y = res$results,
  predict.function = predict_prob, type = "prob")
# Calculate `IDEA` global and local for feature "x"
idea_x = IDEA$new(predictor = predictor, feature = "x", grid.size = 5)
idea_x$plot_globals(0.5) # plot global effect of all clusters with 50 percent of local mass.