R: Plot Pareto front

h2o.pareto_front {h2o}

R Documentation

Plot Pareto front

Description

Create a Pareto front and plot it. The Pareto front contains the models that are optimal in the sense that, for each model on the front, no other model is better in both criteria. This is useful, for example, for picking models that are both fast to predict and highly accurate. For generic data.frame or H2OFrame input, the task is assumed to be minimization of both metrics.
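To make the definition concrete, the following is a minimal base R sketch of extracting a Pareto front from a plain data.frame when both metrics are minimized. It is not the h2o implementation: the helper `pareto_min` and the data in `models` are hypothetical, and the sketch uses the usual dominance rule (no worse in both metrics, strictly better in at least one), which may treat ties differently than `h2o.pareto_front` does.

```r
# Hypothetical helper (not part of h2o): keep the rows that no other row
# dominates, assuming lower is better for both metrics.
pareto_min <- function(df, x, y) {
  keep <- vapply(seq_len(nrow(df)), function(i) {
    # A row dominates row i if it is no worse in both metrics
    # and strictly better in at least one of them.
    dominates_i <- df[[x]] <= df[[x]][i] & df[[y]] <= df[[y]][i] &
      (df[[x]] < df[[x]][i] | df[[y]] < df[[y]][i])
    !any(dominates_i)
  }, logical(1))
  df[keep, , drop = FALSE]
}

# Made-up leaderboard-like data, for illustration only.
models <- data.frame(
  model_id = c("m1", "m2", "m3", "m4"),
  predict_time_per_row_ms = c(0.02, 0.05, 0.10, 0.08),
  rmse = c(0.40, 0.35, 0.30, 0.36),
  stringsAsFactors = FALSE
)

front <- pareto_min(models, "predict_time_per_row_ms", "rmse")
front$model_id
```

In this made-up data, `m4` is dropped because `m2` is both faster and more accurate; the remaining models each trade speed against error.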

Usage

h2o.pareto_front(
  object,
  leaderboard_frame = NULL,
  x_metric = c("AUTO", "AUC", "AUCPR", "logloss", "MAE", "mean_per_class_error",
    "mean_residual_deviance", "MSE", "predict_time_per_row_ms", "RMSE", "RMSLE",
    "training_time_ms"),
  y_metric = c("AUTO", "AUC", "AUCPR", "logloss", "MAE", "mean_per_class_error",
    "mean_residual_deviance", "MSE", "predict_time_per_row_ms", "RMSE", "RMSLE",
    "training_time_ms"),
  optimum = c("AUTO", "top left", "top right", "bottom left", "bottom right"),
  title = NULL,
  color_col = "algo"
)

Arguments

`object`	an H2OAutoML object, an H2OGrid object, or a leaderboard-like frame (data.frame or H2OFrame)
`leaderboard_frame`	a frame used for generating the leaderboard (used when `object` is not a frame)
`x_metric`	one of the metrics present in the leaderboard
`y_metric`	one of the metrics present in the leaderboard
`optimum`	location of the optimum on XY plane
`title`	title used for plotting
`color_col`	categorical column in the leaderboard that should be used for coloring the points
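As an illustration of how the `optimum` corner relates to the direction of each metric, here is a small base R sketch. It assumes conventional plot axes, so "right" means larger `x_metric` values are better and "top" means larger `y_metric` values are better. The helper `corner_signs` is hypothetical and not part of h2o; it only shows how each corner could be reduced to a minimization problem by flipping signs.

```r
# Hypothetical helper (not part of h2o): map a corner name to per-axis signs.
# Multiplying a metric by its sign turns "find the optimum corner" into
# "minimize both metrics".
corner_signs <- function(optimum) {
  optimum <- match.arg(
    optimum,
    c("top left", "top right", "bottom left", "bottom right")
  )
  parts <- strsplit(optimum, " ", fixed = TRUE)[[1]]
  c(
    x = if (parts[2] == "right") -1 else 1, # "right": larger x is better
    y = if (parts[1] == "top") -1 else 1    # "top": larger y is better
  )
}

# Example: x = prediction time (lower is better), y = AUC (higher is better),
# so the optimum sits in the top left corner.
signs <- corner_signs("top left")
signs
```

Under these assumptions, "bottom left" corresponds to minimizing both metrics, which matches the default described for generic frame input.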

Value

An H2OParetoFront S4 object with a plot method and a `pareto_front` slot

Examples

## Not run: 
library(h2o)
h2o.init()

# Import the prostate dataset into H2O:
df <- h2o.importFile("h2o://prostate.csv")

# Set the response and convert it to a factor (classification):
response <- "CAPSULE"
df[[response]] <- as.factor(df[[response]])

# Split the dataset into a train and test set:
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Run AutoML to build and train up to 10 models:
aml <- h2o.automl(y = response,
                  training_frame = train,
                  max_models = 10,
                  seed = 1)

# Create the Pareto front
pf <- h2o.pareto_front(aml)
plot(pf)
pf@pareto_front # to retrieve the Pareto front subset of the leaderboard

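# Run AutoML again with a different seed, then build a combined
# leaderboard scored on the test set and plot its Pareto front
aml2 <- h2o.automl(y = response,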
                   training_frame = train,
                   max_models = 10,
                   seed = 42)

combined_leaderboard <- h2o.make_leaderboard(list(aml, aml2), test, extra_columns = "ALL")
pf_combined <- h2o.pareto_front(combined_leaderboard, x_metric = "predict_time_per_row_ms",
                                y_metric = "rmse", optimum = "bottom left")
plot(pf_combined)
pf_combined@pareto_front

## End(Not run)

[Package h2o version 3.44.0.3 Index]