h2o.pd_multi_plot {h2o}    R Documentation

Plot partial dependencies for a variable across multiple models

Description

A partial dependence plot (PDP) gives a graphical depiction of the marginal effect of a variable on the response. The effect of a variable is measured as the change in the mean response. PDP assumes independence between the feature for which the PDP is computed and the remaining features.
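
To make the "change in the mean response" concrete, the following is a minimal base-R sketch of how a partial dependence curve can be computed by hand. It is not how h2o.pd_multi_plot is implemented; the linear model on the built-in mtcars data and the helper name partial_dependence are assumptions chosen purely for illustration.

```r
# Manual partial dependence for one feature, using base R only.
# Assumption: a linear model on mtcars stands in for an H2O model.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper (not part of h2o): for each grid value, overwrite
# `column` in every row of `data` and average the model's predictions.
partial_dependence <- function(model, data, column, grid) {
  sapply(grid, function(value) {
    modified <- data
    modified[[column]] <- value                 # force the feature to this value
    mean(predict(model, newdata = modified))    # mean response at this value
  })
}

# Evaluate the curve on an evenly spaced grid over the observed range of wt.
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)
pd <- partial_dependence(fit, mtcars, "wt", grid)

# An ICE curve is the same computation restricted to a single row,
# so nothing is averaged away.
ice_row <- partial_dependence(fit, mtcars[1, , drop = FALSE], "wt", grid)

print(data.frame(wt = grid, mean_prediction = pd, ice_first_row = ice_row))
```

Because every row keeps its original values for the other features while the inspected feature is overwritten, the curve can include feature combinations that never occur in the data. This is why the independence assumption stated above matters.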

Usage

h2o.pd_multi_plot(
  object,
  newdata,
  column,
  best_of_family = TRUE,
  target = NULL,
  row_index = NULL,
  max_levels = 30,
  show_rug = TRUE
)

Arguments

object

Either a list of H2O models/model_ids or an H2OAutoML object.

newdata

An H2OFrame.

column

A feature column name to inspect. Character string.

best_of_family

If TRUE, plot only the best model of each algorithm family; if FALSE, plot all models. Defaults to TRUE.

target

For multinomial classification, plot the PDP only for the specified target category.

row_index

Optional. Calculate the Individual Conditional Expectation (ICE) for the row given by row_index. Integer.

max_levels

An integer specifying the maximum number of factor levels to show. Defaults to 30.

show_rug

Show rug to visualize the density of the column. Defaults to TRUE.

Value

A ggplot2 object

Examples

## Not run: 
library(h2o)
h2o.init()

# Import the wine dataset into H2O:
f <- "https://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)

# Set the response
response <- "quality"

# Split the dataset into a train and test set:
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Build and train the models with AutoML:
aml <- h2o.automl(y = response,
                  training_frame = train,
                  max_models = 10,
                  seed = 1)

# Create the partial dependence plot
pdp <- h2o.pd_multi_plot(aml, test, column = "alcohol")
print(pdp)
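
# Optional variations (a sketch; every argument used here is listed in the
# Usage section above). Plot all models from the AutoML run instead of only
# the best of each family, and hide the rug:
pdp_all <- h2o.pd_multi_plot(aml, test, column = "alcohol",
                             best_of_family = FALSE, show_rug = FALSE)
print(pdp_all)

# The return value is documented as a ggplot2 object, so standard ggplot2
# layers should apply (assumes the ggplot2 package is installed):
print(pdp + ggplot2::ggtitle("Partial dependence of quality on alcohol"))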

## End(Not run)

[Package h2o version 3.44.0.3]