calculate_variable_splits {ceterisParibus} | R Documentation |
Calculate Split Points for Selected Variables
Description
This function calculates candidate splits for each selected variable. For numerical variables, splits are calculated as percentiles (in general, as grid_points uniformly spaced quantiles). For all other variables, splits are the unique values.
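The rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: the helper name sketch_splits and the toy data frame are hypothetical, and dropping duplicated quantiles with unique() is an assumption.

```r
# Hypothetical helper illustrating the described rule (not the package's code).
sketch_splits <- function(data, variables = colnames(data), grid_points = 101) {
  splits <- lapply(variables, function(var) {
    values <- data[[var]]
    if (is.numeric(values)) {
      # Uniformly spaced probabilities from 0 to 1, grid_points of them
      probs <- seq(0, 1, length.out = grid_points)
      # Assumption: duplicated quantile values are dropped
      unique(unname(quantile(values, probs = probs, na.rm = TRUE)))
    } else {
      # Non-numerical variables: every distinct value is a candidate split
      sort(unique(values))
    }
  })
  names(splits) <- variables
  splits
}

# Toy data, invented for illustration only
toy <- data.frame(
  surface = c(25, 40, 55, 70, 120),
  district = factor(c("A", "B", "A", "C", "B"))
)
toy_splits <- sketch_splits(toy, grid_points = 5)
```

With grid_points = 5 the numerical grid uses the probabilities 0, 0.25, 0.5, 0.75 and 1, while the factor column contributes its distinct levels.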
Usage
calculate_variable_splits(data, variables = colnames(data), grid_points = 101)
Arguments
data |
validation dataset, used to determine the distribution of observations |
variables |
names of the variables for which splits shall be calculated |
grid_points |
number of points used for the response path; for numerical variables, the number of quantiles in the grid |
Details
Note that calculate_variable_splits is an S3 generic function.
If you want to work with non-standard data sources (such as H2O ddf or external databases),
you should overload it, that is, provide a method for the class of your data object.
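A minimal sketch of such an overload follows. The class name column_store and its layout (a list holding a columns element) are hypothetical, invented for illustration. The stand-in generic is defined only when calculate_variable_splits is not already available, so the snippet also runs without the package attached.

```r
# Stand-in generic, used only if the package's generic is not already visible.
if (!exists("calculate_variable_splits")) {
  calculate_variable_splits <- function(data, variables = colnames(data),
                                        grid_points = 101) {
    UseMethod("calculate_variable_splits")
  }
}

# Method for a hypothetical class that keeps its columns in a plain list.
calculate_variable_splits.column_store <- function(data,
                                                   variables = names(data$columns),
                                                   grid_points = 101) {
  splits <- lapply(variables, function(var) {
    values <- data$columns[[var]]
    if (is.numeric(values)) {
      probs <- seq(0, 1, length.out = grid_points)
      unique(unname(quantile(values, probs = probs, na.rm = TRUE)))
    } else {
      sort(unique(values))
    }
  })
  names(splits) <- variables
  splits
}

# Toy object of the hypothetical class
store <- structure(
  list(columns = list(
    floor = c(1, 2, 3, 4, 5),
    district = c("Wola", "Praga", "Wola")
  )),
  class = "column_store"
)
store_splits <- calculate_variable_splits(store, grid_points = 3)
```

S3 dispatch picks the method from the class attribute of the first argument, so callers keep using the same function name regardless of where the data lives. A real method for an external source would replace the body with queries against that source.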
Value
A named list of splits, with one element per selected variable
Examples
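library("DALEX")
library("ceterisParibus")
## Not run:
library("randomForest")
set.seed(59)
# Fit a random forest on the apartments data from DALEX
# (the model itself is not used when computing the splits)
apartments_rf_model <- randomForest(m2.price ~ construction.year + surface + floor +
                                      no.rooms + district, data = apartments)
# Variables for which candidate splits are calculated
vars <- c("construction.year", "surface", "floor", "no.rooms", "district")
calculate_variable_splits(apartments, vars)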
## End(Not run)
[Package ceterisParibus version 0.4.2 Index]