R: Bayesian Optimization with Gaussian Processes

bayesOpt {ParBayesianOptimization}

R Documentation

Bayesian Optimization with Gaussian Processes

Description

Maximizes a user-defined function within a set of bounds. After the function has been sampled a pre-determined number of times, a Gaussian process is fit to the results. An acquisition function is then maximized to determine the most likely location of the global maximum of the user-defined function. This process is repeated for a set number of iterations.
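
The loop described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes a zero-mean Gaussian process with a fixed squared-exponential kernel, maximizes the upper confidence bound over a fixed grid instead of using an optimizer, and skips score scaling. The helper names `sqExpKernel` and `gpPredict` are hypothetical.

```r
# Toy objective (same shape as Example 1 on this page).
scoringFunction <- function(x) {
  exp(-(2 - x)^2) * 1.5 + exp(-(4 - x)^2) * 2 + exp(-(6 - x)^2)
}

# Squared-exponential kernel with an assumed, fixed length scale.
sqExpKernel <- function(a, b, lengthScale = 1) {
  exp(-outer(a, b, "-")^2 / (2 * lengthScale^2))
}

# Posterior mean and standard deviation of a zero-mean, unit-variance GP.
# The small nugget keeps the kernel matrix invertible.
gpPredict <- function(xObs, yObs, xNew, nugget = 1e-6) {
  K    <- sqExpKernel(xObs, xObs) + diag(nugget, length(xObs))
  Ks   <- sqExpKernel(xNew, xObs)
  Kinv <- solve(K)
  mu   <- as.vector(Ks %*% Kinv %*% yObs)
  v    <- 1 - rowSums((Ks %*% Kinv) * Ks)
  list(mean = mu, sd = sqrt(pmax(v, 0)))
}

set.seed(1)
xObs  <- runif(3, 0, 8)                  # initialization points
yObs  <- scoringFunction(xObs)
grid  <- seq(0, 8, length.out = 401)     # candidate inputs
kappa <- 2.576

for (i in 1:10) {
  post  <- gpPredict(xObs, yObs, grid)   # fit the GP to all results so far
  ucb   <- post$mean + kappa * post$sd   # acquisition function
  xNext <- grid[which.max(ucb)]          # most promising candidate
  xObs  <- c(xObs, xNext)
  yObs  <- c(yObs, scoringFunction(xNext))
}

bestX <- xObs[which.max(yObs)]
```

Each pass refits the model to every score collected so far, so later candidates are informed by earlier ones. `bayesOpt` follows the same outline but handles model fitting, scaling, and the acquisition search itself.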

Usage

bayesOpt(
  FUN,
  bounds,
  saveFile = NULL,
  initGrid,
  initPoints = 4,
  iters.n = 3,
  iters.k = 1,
  otherHalting = list(timeLimit = Inf, minUtility = 0),
  acq = "ucb",
  kappa = 2.576,
  eps = 0,
  parallel = FALSE,
  gsPoints = pmax(100, length(bounds)^3),
  convThresh = 1e+08,
  acqThresh = 1,
  errorHandling = "stop",
  plotProgress = FALSE,
  verbose = 1,
  ...
)

Arguments

`FUN`	the function to be maximized. This function should return a named list with at least 1 component. The first component must be named `Score` and should contain the metric to be maximized. You may return other named scalar elements that you wish to include in the final summary table.
`bounds`	named list of lower and upper bounds for each `FUN` input. The names of the list should be arguments passed to `FUN`. Use the "L" suffix to indicate integers, for example `max_depth = c(2L, 10L)`.
`saveFile`	character filepath (including file name and extension, .RDS) that specifies the location to save results as they are obtained. A `bayesOpt` object is saved to the file after each epoch.
`initGrid`	user-specified points at which to sample the scoring function. Should be a `data.frame` or `data.table` whose column names match the names in `bounds`.
`initPoints`	Number of points to initialize the process with. Points are chosen with Latin hypercube sampling within the bounds supplied.
`iters.n`	The total number of times FUN will be run after initialization.
`iters.k`	integer that specifies the number of times to sample FUN at each Epoch (optimization step). If running in parallel, good practice is to set `iters.k` to some multiple of the number of cores you have designated for this process. Must be lower than `iters.n`, and preferably a divisor of it.
`otherHalting`	A list of other halting specifications. The process stops if any of the following is true; these checks are performed only between optimization steps: (1) the elapsed time in seconds is greater than the list element `timeLimit`; (2) the utility expected from the Gaussian process is less than the list element `minUtility`.
`acq`	acquisition function type to be used. Can be "ucb" (Upper Confidence Bound), "ei" (Expected Improvement), "eips" (Expected Improvement Per Second) or "poi" (Probability of Improvement).
`kappa`	tunable parameter kappa of the upper confidence bound. Adjusts exploitation/exploration. Increasing kappa gives more weight to uncertainty (unexplored space), therefore incentivising exploration. This number is how many standard deviations above the predicted mean the upper confidence bound sits. Default is 2.576, the upper limit of a two-sided 99% confidence interval (roughly the 99.5th percentile of the standard normal distribution).
`eps`	tunable parameter epsilon of ei, eips and poi. Adjusts exploitation/exploration. This value is added to y_max after the scaling, so it should be between -0.1 and 0.1. Increasing eps will make the "improvement" threshold for new points higher, therefore incentivising exploitation.
`parallel`	should the process run in parallel? If TRUE, several criteria must be met: a parallel backend must be registered; objects required by `FUN` must be loaded into each cluster; packages required by `FUN` must be loaded into each cluster (see vignettes); `FUN` must be thread safe.
`gsPoints`	integer that specifies how many initial points to try when searching for the optimum of the acquisition function. Increase this for a higher chance to find global optimum, at the expense of more time.
`convThresh`	convergence threshold passed to `factr` when the `optim` function (L-BFGS-B) is called. Lower values will take longer to converge, but may be more accurate.
`acqThresh`	number between 0 and 1. The minimum fraction of the global optimum's utility that a local optimum must reach to be included as a candidate parameter set in the next round of scoring. If 1.0, only the global optimum will be used as a candidate parameter set. If 0.5, only local optima with at least 50 percent of the utility of the global optimum will be used.
`errorHandling`	How to proceed if FUN returns an error. All errors are stored in `scoreSummary`. Can be one of 3 options: "stop" stops the function running and returns results; "continue" keeps the process running; an integer allows the process to continue until that many errors have occurred, after which the results are returned.
`plotProgress`	Should the progress of the Bayesian optimization be plotted? The top graph shows the score(s) obtained at each iteration. The bottom graph shows the estimated utility of each point. This is useful to display how much utility the Gaussian process is assuming still exists. If your utility is approaching 0, then you can be confident you are close to an optimal parameter set.
`verbose`	Whether or not to print progress to the console. If 0, nothing will be printed. If 1, progress will be printed. If 2, progress and information about new parameter-score pairs will be printed.
`...`	Other parameters passed to `DiceKriging::km()`. All FUN inputs and scores are scaled from 0-1 before being passed to km. FUN inputs are scaled within `bounds`, and scores are scaled by 0 = min(scores), 1 = max(scores).
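
The roles of `kappa` and `eps` in the acquisition functions listed above can be illustrated with the standard textbook forms for a maximization problem. This is a sketch, not the package's internal code: the function names `acqUCB`, `acqPOI` and `acqEI` are hypothetical, `mu` and `sigma` stand for the Gaussian process mean and standard deviation at a candidate point, and `yMax` stands for the best scaled score seen so far. "eips" is not shown because it also requires a model of run time.

```r
# Upper confidence bound: predicted mean plus kappa standard deviations.
acqUCB <- function(mu, sigma, kappa = 2.576) {
  mu + kappa * sigma
}

# Probability that a candidate beats yMax by more than eps.
acqPOI <- function(mu, sigma, yMax, eps = 0) {
  pnorm((mu - yMax - eps) / sigma)
}

# Expected amount by which a candidate beats yMax + eps.
acqEI <- function(mu, sigma, yMax, eps = 0) {
  improvement <- mu - yMax - eps
  z <- improvement / sigma
  improvement * pnorm(z) + sigma * dnorm(z)
}

# Two candidates: an uncertain one with a lower mean,
# and a well-explored one with a higher mean.
mu    <- c(0.70, 0.90)
sigma <- c(0.20, 0.02)

which.max(acqUCB(mu, sigma))               # large kappa favours the uncertain point
which.max(acqUCB(mu, sigma, kappa = 0.5))  # small kappa favours the higher mean

# A larger eps raises the bar a candidate has to clear.
acqPOI(0.90, 0.02, yMax = 0.90, eps = 0)
acqPOI(0.90, 0.02, yMax = 0.90, eps = 0.05)

# The default kappa is close to the 99.5% standard normal quantile.
qnorm(0.995)
```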

Value

An object of class bayesOpt containing information about the process.

FUN The scoring function.
bounds The bounds originally supplied.
iters The total iterations that have been run.
initPars The initialization parameters.
optPars The optimization parameters.
GauProList A list containing information on the Gaussian Processes used in optimization.
scoreSummary A data.table with results from the execution of FUN at different inputs. Includes information on the epoch, iteration, function inputs, score, and any other information returned by FUN.
stopStatus Information on what caused the function to stop running. Possible explanations are time limit, minimum utility not met, errors in FUN, iters.n was reached, or the Gaussian Process encountered an error.
elapsedTime The total time in seconds the function was executing.
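
As a hedged sketch of how the returned object might be inspected, the snippet below runs a short optimization and reads the elements listed above with `$`. It assumes the package is installed (the call is skipped otherwise) and that `getBestPars()` is available in the installed version; the variable name `optObj` is arbitrary.

```r
# Scoring function returning the required named list with a Score element.
scoringFunction <- function(x) {
  list(Score = exp(-(2 - x)^2) * 1.5 + exp(-(4 - x)^2) * 2 + exp(-(6 - x)^2))
}

# Only run the optimization when the package is available.
if (requireNamespace("ParBayesianOptimization", quietly = TRUE)) {
  optObj <- ParBayesianOptimization::bayesOpt(
    FUN = scoringFunction,
    bounds = list(x = c(0, 8)),
    initPoints = 3,
    iters.n = 2,
    gsPoints = 10,
    verbose = 0
  )

  print(optObj$scoreSummary)  # one row per evaluation of FUN
  print(optObj$stopStatus)    # why the run ended
  print(optObj$elapsedTime)   # total run time in seconds

  # Parameter set with the highest Score found so far.
  print(ParBayesianOptimization::getBestPars(optObj))
}
```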

Vignettes

It is highly recommended to read the package's GitHub repository for examples. Several vignettes are also available from the official CRAN listing.

References

Jasper Snoek, Hugo Larochelle, Ryan P. Adams (2012) Practical Bayesian Optimization of Machine Learning Algorithms

Examples

# Example 1 - Optimization of a continuous single parameter function
scoringFunction <- function(x) {
  a <- exp(-(2-x)^2)*1.5
  b <- exp(-(4-x)^2)*2
  c <- exp(-(6-x)^2)*1
  return(list(Score = a+b+c))
}

bounds <- list(x = c(0,8))

Results <- bayesOpt(
    FUN = scoringFunction
  , bounds = bounds
  , initPoints = 3
  , iters.n = 2
  , gsPoints = 10
)

## Not run: 
# Example 2 - Hyperparameter Tuning in xgboost
if (requireNamespace('xgboost', quietly = TRUE)) {
  library("xgboost")

  data(agaricus.train, package = "xgboost")

  Folds <- list(
      Fold1 = as.integer(seq(1,nrow(agaricus.train$data),by = 3))
    , Fold2 = as.integer(seq(2,nrow(agaricus.train$data),by = 3))
    , Fold3 = as.integer(seq(3,nrow(agaricus.train$data),by = 3))
  )

  scoringFunction <- function(max_depth, min_child_weight, subsample) {

    dtrain <- xgb.DMatrix(agaricus.train$data,label = agaricus.train$label)

    Pars <- list(
        booster = "gbtree"
      , eta = 0.01
      , max_depth = max_depth
      , min_child_weight = min_child_weight
      , subsample = subsample
      , objective = "binary:logistic"
      , eval_metric = "auc"
    )

    xgbcv <- xgb.cv(
         params = Pars
       , data = dtrain
       , nrounds = 100
       , folds = Folds
       , prediction = TRUE
       , showsd = TRUE
       , early_stopping_rounds = 5
       , maximize = TRUE
       , verbose = 0
    )

    return(
      list(
          Score = max(xgbcv$evaluation_log$test_auc_mean)
        , nrounds = xgbcv$best_iteration
      )
    )
  }

  bounds <- list(
      max_depth = c(2L, 10L)
    , min_child_weight = c(1, 100)
    , subsample = c(0.25, 1)
  )

  ScoreResult <- bayesOpt(
      FUN = scoringFunction
    , bounds = bounds
    , initPoints = 3
    , iters.n = 2
    , iters.k = 1
    , acq = "ei"
    , gsPoints = 10
    , parallel = FALSE
    , verbose = 1
  )
}

## End(Not run)
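
The criteria listed under `parallel` can be met along the following lines. This is a sketch under assumptions: it uses the `doParallel` package as the backend (any registered `foreach` backend should serve), the object `peakCenters` is a hypothetical stand-in for data that `FUN` needs, and the whole run is skipped when the packages are not installed.

```r
# Object that FUN reads from the calling environment; each worker needs a copy.
peakCenters <- c(2, 4, 6)

scoringFunction <- function(x) {
  heights <- c(1.5, 2, 1)
  list(Score = sum(heights * exp(-(peakCenters - x)^2)))
}

havePackages <- requireNamespace("ParBayesianOptimization", quietly = TRUE) &&
  requireNamespace("doParallel", quietly = TRUE)

if (havePackages) {
  cl <- parallel::makeCluster(2)
  doParallel::registerDoParallel(cl)  # register the parallel backend

  # Objects required by FUN must exist on every worker.
  parallel::clusterExport(cl, "peakCenters", envir = environment())
  # Packages required by FUN would be loaded on the workers here, e.g.
  # parallel::clusterEvalQ(cl, library(xgboost))

  optObj <- ParBayesianOptimization::bayesOpt(
    FUN = scoringFunction,
    bounds = list(x = c(0, 8)),
    initPoints = 4,
    iters.n = 4,
    iters.k = 2,      # two candidates scored per epoch, one per worker
    parallel = TRUE,
    verbose = 0
  )

  parallel::stopCluster(cl)
  foreach::registerDoSEQ()            # return to sequential execution
}
```

Setting `iters.k` equal to the number of workers keeps every worker busy during each epoch, in line with the guidance given for `iters.k`.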

[Package ParBayesianOptimization version 1.2.6 Index]