R: Hyperparameter Optimization with Successive Halving

mlr_optimizers_successive_halving {mlr3hyperband}

R Documentation

Hyperparameter Optimization with Successive Halving

Description

Optimizer using the Successive Halving Algorithm (SHA). SHA is initialized with the number of starting configurations n, the reduction factor eta, and the minimum r_min and maximum r_max budget of a single evaluation. The algorithm starts by sampling n random configurations and allocating the minimum budget r_min to each of them. The configurations are evaluated, and only the best 1 / eta of them are kept; the rest are discarded. The remaining configurations are promoted to the next stage and evaluated on a budget that is eta times larger. The following table shows the stage layout for n = 8, eta = 2, r_min = 1 and r_max = 8.

`i`	`n_i`	`r_i`
0	8	1
1	4	2
2	2	4
3	1	8

i is the stage number, n_i is the number of configurations and r_i is the budget allocated to a single configuration.

The number of stages is calculated so that each stage consumes approximately the same budget. This sometimes results in the minimum budget having to be slightly adjusted by the algorithm.
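
The stage layout described above can be reproduced with a few lines of base R. The following is a minimal sketch, not the code used inside mlr3hyperband: the helper name sha_schedule() is hypothetical, and the way the number of stages is capped (by the budget range and by the number of configurations) is an assumption derived from the description.

```r
# Illustrative helper (hypothetical name), not part of mlr3hyperband.
sha_schedule = function(n, eta, r_min, r_max, adjust_minimum_budget = FALSE) {
  stopifnot(eta > 1, r_min > 0, r_max >= r_min, n >= 1)
  tol = 1e-8  # guards against floating-point error in log() and division

  # stages supported by the budget range: r_min * eta^s <= r_max
  s_budget = floor(log(r_max / r_min, base = eta) + tol)
  # stages supported by the configurations: floor(n / eta^s) >= 1
  s_config = floor(log(n, base = eta) + tol)
  s_max = min(s_budget, s_config)

  # assumption: raise r_min so that the last stage runs on exactly r_max
  if (adjust_minimum_budget) r_min = r_max / eta^s_max

  i = 0:s_max
  data.frame(
    i = i,                          # stage number
    n_i = floor(n / eta^i + tol),   # configurations evaluated in stage i
    r_i = r_min * eta^i             # budget per configuration in stage i
  )
}

# reproduces the table for eta = 2, r_min = 1 and r_max = 8
schedule = sha_schedule(n = 8, eta = 2, r_min = 1, r_max = 8)
print(schedule)

# with only 4 starting configurations, the minimum budget is raised to 2
adjusted = sha_schedule(n = 4, eta = 2, r_min = 1, r_max = 8,
  adjust_minimum_budget = TRUE)
print(adjusted)
```

In the first call every stage consumes the same total budget (n_i * r_i = 8), which matches the statement that the stages use approximately equal budgets.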

Resources

The gallery features a collection of case studies and demos about optimization.

Tune the hyperparameters of XGBoost with Hyperband (Hyperband can be easily swapped with SHA).
Use data subsampling and Hyperband to optimize a support vector machine.

Dictionary

This bbotk::Optimizer can be instantiated via the dictionary bbotk::mlr_optimizers or with the associated sugar function bbotk::opt():

mlr_optimizers$get("successive_halving")
opt("successive_halving")

Parameters

n: integer(1)
Number of configurations in the base stage.
eta: numeric(1)
With every stage, the budget is increased by a factor of eta and only the best 1 / eta fraction of the configurations is promoted to the next stage. Non-integer values are supported, but eta must be greater than 1.
sampler: paradox::Sampler
Object defining how the samples of the parameter space should be drawn. The default is uniform sampling.
repetitions: integer(1)
If 1 (default), optimization is stopped once all stages are evaluated. Otherwise, optimization is stopped after repetitions runs of SHA. The bbotk::Terminator might stop the optimization before all repetitions are executed.
adjust_minimum_budget: logical(1)
If TRUE, the minimum budget is increased so that the last stage uses the maximum budget defined in the search space.
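
The parameters above can be set when the optimizer is constructed or changed afterwards through its parameter set. The sketch below assumes that mlr3hyperband is installed and loaded so that the "successive_halving" key is registered; the chosen values are arbitrary and only illustrate the syntax.

```r
library(bbotk)
library(mlr3hyperband)  # registers "successive_halving" in mlr_optimizers

# set parameters at construction via the sugar function
optimizer = opt("successive_halving", n = 27L, eta = 3, repetitions = 2L)

# change a parameter afterwards via the parameter set
optimizer$param_set$values$adjust_minimum_budget = TRUE

# inspect the current configuration
print(optimizer$param_set$values)
```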

Custom Sampler

SHA supports a custom paradox::Sampler object for drawing the initial configurations. A custom sampler may look like this (params is not defined in this fragment; it is assumed to hold the parameters of the search space):

# - beta distribution with alpha = 2 and beta = 5
# - categorical distribution with custom probabilities
sampler = SamplerJointIndep$new(list(
  Sampler1DRfun$new(params[[2]], function(n) rbeta(n, 2, 5)),
  Sampler1DCateg$new(params[[3]], prob = c(0.2, 0.3, 0.5))
))

Progress Bars

$optimize() supports progress bars via the package progressr combined with a bbotk::Terminator. Simply wrap the call in progressr::with_progress() to enable them. We recommend using the package progress as the backend; enable it with progressr::handlers("progress").
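
A minimal sketch of the wrapping described above, reusing the toy problem from the Examples section. It assumes that the packages progressr and progress are installed; in a non-interactive session progressr may not display the bar, although the optimization still runs.

```r
library(bbotk)
library(mlr3hyperband)  # registers "successive_halving" in mlr_optimizers
library(progressr)

# same toy problem as in the Examples section
search_space = domain = ps(
  x1 = p_dbl(-5, 10),
  x2 = p_dbl(0, 15),
  fidelity = p_dbl(1e-2, 1, tags = "budget")
)
objective = ObjectiveRFun$new(
  fun = function(xs) branin_wu(xs[["x1"]], xs[["x2"]], xs[["fidelity"]]),
  domain = domain,
  codomain = ps(y = p_dbl(tags = "minimize"))
)
instance = OptimInstanceSingleCrit$new(
  objective = objective,
  search_space = search_space,
  terminator = trm("evals", n_evals = 50)
)
optimizer = opt("successive_halving")

# use the progress package as backend and wrap the optimization call
handlers("progress")
with_progress(optimizer$optimize(instance))
```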

Logging

SHA uses a logger (as implemented in lgr) from package bbotk. Use lgr::get_logger("bbotk") to access and control the logger.
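
As a brief illustration, the logger can be made less verbose by raising its threshold. This sketch assumes only that the package lgr is installed; the threshold "warn" is one possible choice.

```r
library(lgr)

# access the logger used by bbotk
logger = lgr::get_logger("bbotk")

# only show warnings and errors during optimization
logger$set_threshold("warn")

# to restore the more verbose output later, use:
# logger$set_threshold("info")
```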

Super classes

bbotk::Optimizer -> bbotk::OptimizerBatch -> OptimizerBatchSuccessiveHalving

Methods

Public methods

OptimizerBatchSuccessiveHalving$new()
OptimizerBatchSuccessiveHalving$clone()

Method `new()`

Creates a new instance of this R6 class.

Usage

OptimizerBatchSuccessiveHalving$new()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

OptimizerBatchSuccessiveHalving$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Source

Jamieson K, Talwalkar A (2016). “Non-stochastic Best Arm Identification and Hyperparameter Optimization.” In Gretton A, Robert CC (eds.), Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, volume 51 of Proceedings of Machine Learning Research, 240-248. http://proceedings.mlr.press/v51/jamieson16.html.

Examples

library(bbotk)
library(data.table)

# set search space
search_space = domain = ps(
  x1 = p_dbl(-5, 10),
  x2 = p_dbl(0, 15),
  fidelity = p_dbl(1e-2, 1, tags = "budget")
)

# Branin function with fidelity, see `bbotk::branin()`
fun = function(xs) branin_wu(xs[["x1"]], xs[["x2"]], xs[["fidelity"]])

# create objective
objective = ObjectiveRFun$new(
  fun = fun,
  domain = domain,
  codomain = ps(y = p_dbl(tags = "minimize"))
)

# initialize instance and optimizer
instance = OptimInstanceSingleCrit$new(
  objective = objective,
  search_space = search_space,
  terminator = trm("evals", n_evals = 50)
)

optimizer = opt("successive_halving")

# optimize branin function
optimizer$optimize(instance)

# best scoring evaluation
instance$result

# all evaluations
as.data.table(instance$archive)

[Package mlr3hyperband version 0.6.0 Index]
