shap.plot.dependence {SHAPforxgboost} | R Documentation |
SHAP dependence plot and interaction plot, optional to be colored by a selected feature
Description
This function by default makes a simple dependence plot with feature values
on the x-axis and SHAP values on the y-axis, optional to color by another
feature. It is optional to use a different variable for SHAP values on the
y-axis, and color the points by the feature value of a designated variable.
Not colored if color_feature
is not supplied. If data_int
(the
SHAP interaction values dataset) is supplied, it will plot the interaction
effect between y
and x
on the y-axis. Dependence plot is easy
to make if you have the SHAP values dataset from predict.xgb.Booster
or predict.lgb.Booster
.
It is not necessary to start with the long format data, but since that is
used for the summary plot, we just continue to use it here.
Usage
shap.plot.dependence(
data_long,
x,
y = NULL,
color_feature = NULL,
data_int = NULL,
dilute = FALSE,
smooth = TRUE,
size0 = NULL,
add_hist = FALSE,
add_stat_cor = FALSE,
alpha = NULL,
jitter_height = 0,
jitter_width = 0,
...
)
Arguments
data_long |
the long format SHAP values from |
x |
which feature to show on x-axis, it will plot the feature value |
y |
which shap values to show on y-axis, it will plot the SHAP value of that feature. y is default to x, if y is not provided, just plot the SHAP values of x on the y-axis |
color_feature |
which feature value to use for coloring, color by the feature value. If "auto", will select the feature "c" minimizing the variance of the shap value given x and c, which can be viewed as a heuristic for the strongest interaction. |
data_int |
the 3-dimention SHAP interaction values array. if |
dilute |
a number or logical, dafault to TRUE, will plot
|
smooth |
optional to add a loess smooth line, default to TRUE. |
size0 |
point size, default to 1 if nobs<1000, 0.4 if nobs>1000 |
add_hist |
whether to add histogram using |
add_stat_cor |
add correlation and p-value from |
alpha |
point transparancy, default to 1 if nobs<1000 else 0.6 |
jitter_height |
amount of vertical jitter (see hight in |
jitter_width |
amount of horizontal jitter (see width in |
... |
additional parameters passed to |
Value
be default a ggplot2
object, based on which you could add more geom
layers.
Examples
# **SHAP dependence plot**
# 1. simple dependence plot with SHAP values of x on the y axis
shap.plot.dependence(data_long = shap_long_iris, x="Petal.Length",
add_hist = TRUE, add_stat_cor = TRUE)
# 2. can choose a different SHAP values on the y axis
shap.plot.dependence(data_long = shap_long_iris, x="Petal.Length",
y = "Petal.Width")
# 3. color by another feature's feature values
shap.plot.dependence(data_long = shap_long_iris, x="Petal.Length",
color_feature = "Petal.Width")
# 4. choose 3 different variables for x, y, and color
shap.plot.dependence(data_long = shap_long_iris, x="Petal.Length",
y = "Petal.Width", color_feature = "Petal.Width")
# Optional to add hist or remove smooth line, optional to plot fewer data (make plot quicker)
shap.plot.dependence(data_long = shap_long_iris, x="Petal.Length",
y = "Petal.Width", color_feature = "Petal.Width",
add_hist = TRUE, smooth = FALSE, dilute = 3)
# to make a list of plot
plot_list <- lapply(names(iris)[2:3], shap.plot.dependence, data_long = shap_long_iris)
# **SHAP interaction effect plot **
# To get the interaction SHAP dataset for plotting, need to get `shap_int` first:
mod1 = xgboost::xgboost(
data = as.matrix(iris[,-5]), label = iris$Species,
gamma = 0, eta = 1, lambda = 0,nrounds = 1, verbose = FALSE, nthread = 1)
# Use either:
data_int <- shap.prep.interaction(xgb_mod = mod1,
X_train = as.matrix(iris[,-5]))
# or:
shap_int <- predict(mod1, as.matrix(iris[,-5]),
predinteraction = TRUE)
# if data_int is supplied, y axis will plot the interaction values of y (vs. x)
shap.plot.dependence(data_long = shap_long_iris,
data_int = shap_int_iris,
x="Petal.Length",
y = "Petal.Width",
color_feature = "Petal.Width")