light_profile2d {flashlight} | R Documentation |
2D Partial Dependence and other 2D Profiles
Description
Calculates different types of 2D profiles across two variables. By default, partial dependence profiles are calculated (see Friedman, 2001). Other options are response, predicted values, residuals, and shap. The results are aggregated by (weighted) means.
Usage
light_profile2d(x, ...)
## Default S3 method:
light_profile2d(x, ...)
## S3 method for class 'flashlight'
light_profile2d(
x,
v = NULL,
data = NULL,
by = x$by,
type = c("partial dependence", "predicted", "response", "residual", "shap"),
breaks = NULL,
n_bins = 11L,
cut_type = "equal",
use_linkinv = TRUE,
counts = TRUE,
counts_weighted = FALSE,
pd_evaluate_at = NULL,
pd_grid = NULL,
pd_indices = NULL,
pd_n_max = 1000L,
pd_seed = NULL,
...
)
## S3 method for class 'multiflashlight'
light_profile2d(
x,
v = NULL,
data = NULL,
type = c("partial dependence", "predicted", "response", "residual", "shap"),
breaks = NULL,
n_bins = 11L,
cut_type = "equal",
pd_evaluate_at = NULL,
pd_grid = NULL,
...
)
Arguments
x |
An object of class "flashlight" or "multiflashlight". |
... |
Further arguments passed to |
v |
A vector of exactly two variable names to be profiled. |
data |
An optional |
by |
An optional vector of column names used to additionally group the results. |
type |
Type of the profile: Either "partial dependence", "predicted", "response", "residual", or "shap". |
breaks |
Named list of cut breaks specifying how to bin one or more numeric
variables. Used to overwrite automatic binning via |
n_bins |
Approximate number of unique values to evaluate for numeric |
cut_type |
Should numeric |
use_linkinv |
Should retransformation function be applied?
Default is |
counts |
Should observation counts be added? |
counts_weighted |
If |
pd_evaluate_at |
A named list of evaluation points for one or more variables. Only relevant for type = "partial dependence". |
pd_grid |
An evaluation |
pd_indices |
A vector of row numbers to consider in calculating partial dependence profiles. Only used for type = "partial dependence". |
pd_n_max |
Maximum number of ICE profiles to calculate
(will be randomly picked from |
pd_seed |
Integer random seed used to select ICE profiles. Only used for type = "partial dependence". |
Details
Different binning options are available, see the arguments breaks, n_bins and cut_type above.
For high-resolution partial dependence plots, it might be necessary to specify breaks, pd_evaluate_at or pd_grid in order to avoid empty parts in the plot. A high value of n_bins might not have the desired effect, as it is internally capped at the number of distinct values of a variable.
For partial dependence and prediction profiles, "model", "predict_function", "linkinv" and "data" are required. For response profiles, "y", "linkinv" and "data" are required, and for shap profiles just "shap". "data" can be passed on the fly.
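To make the default profile type concrete, the following base R sketch computes a 2D partial dependence profile by hand: for every point of an evaluation grid, both variables are overwritten in all rows of the data, predictions are made, and the predictions are averaged. It illustrates the idea only and is not the package's internal code; the grid values are arbitrary assumptions, and an unweighted mean() is used where case weights would call for weighted.mean().

```r
# Base R sketch of a 2D partial dependence profile (not flashlight's internal code)
fit <- lm(Sepal.Length ~ ., data = iris)

# Illustrative evaluation grid: three Petal.Length values crossed with all Species
grid <- expand.grid(
  Petal.Length = c(2, 4, 6),
  Species = levels(iris$Species),
  stringsAsFactors = FALSE
)

# For each grid point: overwrite both variables in every row, predict, average
grid$value <- vapply(seq_len(nrow(grid)), function(i) {
  X <- iris
  X$Petal.Length <- grid$Petal.Length[i]
  X$Species <- factor(grid$Species[i], levels = levels(iris$Species))
  mean(predict(fit, newdata = X))
}, numeric(1))

grid
```

Because the model here is additive, the resulting surface changes linearly in Petal.Length within each species; with a more flexible model the same loop would reveal interactions between the two variables.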
Value
An object of class "light_profile2d" with the following elements:
- data: A tibble containing results. Can be used to build fully customized visualizations. Column names can be controlled by options(flashlight.column_name).
- by: Names of group by variables.
- v: The two variable names evaluated.
- type: Same as input type. For information only.
Methods (by class)
- light_profile2d(default): Default method not implemented yet.
- light_profile2d(flashlight): 2D profiles for flashlight.
- light_profile2d(multiflashlight): 2D profiles for multiflashlight.
References
Friedman J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29:1189–1232.
See Also
light_profile(), plot.light_profile2d()
Examples
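library(flashlight)
# Fit a linear model and wrap it in a flashlight
fit <- lm(Sepal.Length ~ ., data = iris)
fl <- flashlight(model = fit, label = "iris", data = iris, y = "Sepal.Length")
# 2D partial dependence profile over a numeric and a categorical variable
light_profile2d(fl, v = c("Petal.Length", "Species"))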
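Two further calls are sketched below using only arguments documented on this page. The bin count and the evaluation points are arbitrary illustrative choices, and the exact shape of the output may differ between package versions.

```r
library(flashlight)

fit <- lm(Sepal.Length ~ ., data = iris)
fl <- flashlight(model = fit, label = "iris", data = iris, y = "Sepal.Length")

# Average predictions over binned values instead of partial dependence
prof_pred <- light_profile2d(
  fl,
  v = c("Petal.Length", "Species"),
  type = "predicted",
  n_bins = 5L
)

# Partial dependence at explicitly chosen evaluation points (illustrative values)
prof_pd <- light_profile2d(
  fl,
  v = c("Petal.Length", "Petal.Width"),
  pd_evaluate_at = list(Petal.Length = 1:6, Petal.Width = c(0.5, 1.5, 2.5))
)

# The results are stored in the documented `data` element
head(prof_pd$data)
```

Passing pd_evaluate_at fixes the grid explicitly, which is one way to avoid empty regions in a high-resolution plot when automatic binning is too coarse.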