R: Optimize an objective function under multiple blackbox...

optim.auglag {laGP}

R Documentation

Optimize an objective function under multiple blackbox constraints

Description

Uses a surrogate modeled augmented Lagrangian (AL) system to optimize an objective function (blackbox or known and linear) under unknown multiple (blackbox) constraints via expected improvement (EI) and variations; a comparator based on EI with constraints is also provided

Usage

optim.auglag(fn, B, fhat = FALSE, equal = FALSE, ethresh = 1e-2, 
  slack = FALSE, cknown = NULL, start = 10, end = 100, 
  Xstart = NULL, sep = TRUE, ab = c(3/2, 8), lambda = 1, rho = NULL, 
  urate = 10, ncandf = function(t) { t }, dg.start = c(0.1, 1e-06), 
  dlim = sqrt(ncol(B)) * c(1/100, 10), Bscale = 1, ey.tol = 1e-2, 
  N = 1000, plotprog = FALSE, verb = 2, ...)
optim.efi(fn, B, fhat = FALSE, cknown = NULL, start = 10, end = 100, 
  Xstart = NULL, sep = TRUE, ab = c(3/2,8), urate = 10, 
  ncandf = function(t) { t }, dg.start = c(0.1, 1e-6), 
  dlim = sqrt(ncol(B))*c(1/100,10), Bscale = 1, plotprog = FALSE, 
  verb = 2, ...)

Arguments

`fn`	function of an input (`x`), facilitating vectorization on a `matrix` `X` thereof, returning a `list` with elements `$obj` containing the (scalar) objective value and `$c` containing a vector of evaluations of the (multiple) constraint function at `x`. The `fn` function must take a `known.only` argument which is explained in the note below; it need not act on that argument
`B`	2-column `matrix` describing the bounding box. The number of rows of the `matrix` determines the input dimension (`length(x)` in `fn(x)`); the first column gives lower bounds and the second gives upper bounds
`fhat`	a scalar logical indicating if the objective function should be modeled with a GP surrogate. The default of `FALSE` assumes a known linear objective scaled by `Bscale`. Using `TRUE` is an “alpha” feature at this time
`equal`	an optional vector containing zeros and ones, whose length equals the number of constraints, specifying which should be treated as equality constraints (`0`) and which as inequality (`1`)
`ethresh`	a threshold used for equality constraints to determine validity for progress measures; ignored if there are no equality constraints
`slack`	A scalar logical indicating if slack variables, and thus exact EI calculations should be used. The default of `slack = FALSE` results in Monte Carlo EI approximation. One can optionally specify `slack = 2` to get the `slack = TRUE` behavior, with a second-stage L-BFGS-B optimization of the EI acquisition applied at the end, starting from the best value found on the random search grid
`cknown`	A optional positive integer vector specifying which of the constraint values returned by `fn` should be treated as “known”, i.e., not modeled with Gaussian processes
`start`	positive integer giving the number of random starting locations before sequential design (for optimization) is performed; `start >= 6` is recommended unless `Xstart` is non-`NULL`; in the current version the starting locations come from a space-filling design via `dopt.gp`
`end`	positive integer giving the total number of evaluations/trials in the optimization; must have `end > start`
`Xstart`	optional matrix of starting design locations in lieu of, or in addition to, `start` random ones; we recommend `nrow(Xstart) + start >= 6`; also must have `ncol(Xstart) = nrow(B)`
`sep`	The default `sep = TRUE` uses separable GPs (i.e., via `newGPsep`, etc.) to model the constraints and objective; otherwise the isotropic GPs are used
`ab`	prior parameters; see `darg` describing the prior used on the lengthscale parameter during emulation(s) for the constraints
`lambda`	`m`-dimensional initial Lagrange multiplier parameter for `m`-constraints
`rho`	positive scalar initial quadratic penalty parameter in the augmented Lagrangian; the default setting of `rho = NULL` causes an automatic starting value to be chosen; see rejoinder to Gramacy, et al. (2016) or supplementary material to Picheny, et al. (2016)
`urate`	positive integer indicating how many optimization trials should pass before each MLE/MAP update is performed for GP correlation lengthscale parameter(s)
`ncandf`	function taking a single integer indicating the optimization trial number `t`, where `start < t <= end`, and returning the number of search candidates (e.g., for expected improvement calculations) at round `t`; the default setting allows the number of candidates to grow linearly with `t`
`dg.start`	2-vector giving starting values for the lengthscale and nugget parameters of the GP surrogate model(s) for constraints
`dlim`	2-vector giving bounds for the lengthscale parameter(s) under MLE/MAP inference, thereby augmenting the prior specification in `ab`
`Bscale`	scalar indicating the relationship between the sum of the inputs, `sum(x)`, to `fn` and the output `fn(x)$obj`; note that at this time only linear objectives are fully supported by the code - more details below
`ey.tol`	a scalar proportion indicating how many of the EIs at `ncandf(t)` candidate locations must be non-zero to “trust” that metric to guide search, reducing to an EY-based search instead [choosing that proportion to be one forces EY-based search]
`N`	positive scalar integer indicating the number of Monte Carlo samples to be used for calculating EI and EY
`plotprog`	`logical` indicating if progress plots should be made after each inner iteration; the plots show three panels tracking the best valid objective, the EI or EY surface over the first two input variables (requires `interp`, and the parameters of the lengthscale(s) of the GP(s) respectively. When `plotprog = TRUE` the `interp.loess` function is used to aid in creating surface plots, however this does not work well with fewer than fifteen points. You may also provide a function as an argument, having similar arguments/formals as `interp.loess`. For example, we use `interp` below, which would have been the default if not for licensing incompatibilities
`verb`	a non-negative integer indicating the verbosity level; the larger the value the more that is printed to the screen
`...`	additional arguments passed to `fn`

Details

These subroutines support a suite of methods used to optimize challenging constrained problems from Gramacy, et al. (2016); and from Picheny, et al., (2016) see references below.

Those schemes hybridize Gaussian process based surrogate modeling and expected improvement (EI; Jones, et., al, 2008) with the additive penalty method (APM) implemented by the augmented Lagrangian (AL, e.g., Nocedal & Wright, 2006). The goal is to minimize a (possibly known) linear objective function f(x) under multiple, unknown (blackbox) constraint functions satisfying c(x) <= 0, which is vector-valued. The solution here emulates the components of c with Gaussian process surrogates, and guides optimization by searching the posterior mean surface, or the EI of, the following composite objective

Y(x) = f(x) + \lambda^\top Y_c(x) + \frac{1}{2\rho} \sum_{i=1}^m \max(0, Y_{c_i}(x))^2,

where \lambda and \rho follow updating equations that guarantee convergence to a valid solution minimizing the objective. For more details, see Gramacy, et al. (2016).

A slack variable implementation that allows for exact EI calculations and can accommodate equality constraints, and mixed (equality and inequality) constraints, is also provided. For further details, see Picheny, et al. (2016).

The example below illustrates a variation on the toy problem considered in both papers, which bases sampling on EI. For examples making used of equality constraints, following the Picheny, et al (2016) papers; see the demos listed in the “See Also” section below.

Although it is off by default, these functions allow an unknown objective to be modeled (fhat = TRUE), rather than assuming a known linear one. For examples see demo("ALfhat") and demo("GSBP") which illustrate the AL and comparators in inequality and mixed constraints setups, respectively.

The optim.efi function is provided as a comparator. This method uses the same underlying GP models to with the hybrid EI and probability of satisfying the constraints heuristic from Schonlau, et al., (1998). See demo("GSBP") and demo("LAH") for optim.efi examples and comparisons between the original AL, the slack variable enhancement(s) on mixed constraint problems with known and blackbox objectives, respectively

Value

The output is a list summarizing the progress of the evaluations of the blackbox under optimization

`prog`	vector giving the best valid (`c(x) < 0`) value of the objective over the trials
`obj`	vector giving the value of the objective for the input under consideration at each trial
`X`	`matrix` giving the input values at which the blackbox function was evaluated
`C`	`matrix` giving the value of the constraint function for the input under consideration at each trial
`d`	`matrix` of lengthscale values obtained at the final update of the GP emulator for each constraint
`df`	if `fhat = TRUE` then this is a `matrix` of lengthscale values for the objective obtained at the final update of the GP emulator
`lambda`	a `matrix` containing `lambda` vectors used in each “outer loop” AL iteration
`rho`	a vector of `rho` values used in each “outer loop” AL iteration

Note

This function is under active development, especially the newest features including separable GP surrogate modeling, surrogate modeling of a blackbox objective, and the use of slack variables for exact EI calculations and the support if equality constraints. Also note that, compared with earlier versions, it is now required to augment your blackbox function (fn) with an argument named known.only. This allows the user to specify if a potentially different object (with a subset of the outputs, those that are “known”) gets returned in certain circumstances. For example, the objective is treated as known in many of our examples. When a non-null cknown object is used, the known.only flag can be used to return only the outputs which are known.

Older versions of this function provided an argument called nomax. The NoMax feature is no longer supported

Author(s)

Robert B. Gramacy rbg@vt.edu

References

Gramacy, R. B. (2020) Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Boca Raton, Florida: Chapman Hall/CRC. (See Chapter 7.) https://bobby.gramacy.com/surrogates/

Picheny, V., Gramacy, R.B., Wild, S.M., Le Digabel, S. (2016). “Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian”. Preprint available on arXiv:1605.09466; https://arxiv.org/abs/1605.09466

Gramacy, R.B, Gray, G.A, Lee, H.K.H, Le Digabel, S., Ranjan P., Wells, G., Wild, S.M. (2016) “Modeling an Augmented Lagrangian for Improved Blackbox Constrained Optimization”, Technometrics (with discussion), 58(1), 1-11. Preprint available on arXiv:1403.4890; https://arxiv.org/abs/1403.4890

Jones, D., Schonlau, M., and Welch, W. J. (1998). “Efficient Global Optimization of Expensive Black Box Functions.” Journal of Global Optimization, 13, 455-492.

Schonlau, M., Jones, D.R., and Welch, W. J. (1998). “Global Versus Local Search in Constrained Optimization of Computer Models.” In New Developments and Applications in Experimental Design, vol. 34, 11-25. Institute of Mathematical Statistics.

Nocedal, J. and Wright, S.J. (2006). Numerical Optimization. 2nd ed. Springer.

Examples

## this example assumes a known linear objective; further examples
## are in the optim.auglag demo

## a test function returning linear objective evaluations and 
## non-linear constraints
aimprob <- function(X, known.only = FALSE)
{
  if(is.null(nrow(X))) X <- matrix(X, nrow=1)
  f <- rowSums(X)
  if(known.only) return(list(obj=f))
  c1 <- 1.5-X[,1]-2*X[,2]-0.5*sin(2*pi*(X[,1]^2-2*X[,2]))
  c2 <- rowSums(X^2)-1.5
  return(list(obj=f, c=cbind(c1,c2)))
}

## set bounding rectangle for adaptive sampling
B <- matrix(c(rep(0,2),rep(1,2)),ncol=2)

## optimization (primarily) by EI, change 25 to 100 for
## 99% chance of finding the global optimum with value 0.6
if(require("interp")) { ## for plotprog=interp
  out <- optim.auglag(aimprob, B, end=25, plotprog=interp)
} else {
  out <- optim.auglag(aimprob, B, end=25)
}

## using the slack variable implementation which is a little slower
## but more precise; slack=2 augments random search with L-BFGS-B
  
out2 <- optim.auglag(aimprob, B, end=25, slack=TRUE)
## Not run: 
out3 <- optim.auglag(aimprob, B, end=25, slack=2)

## End(Not run)

## for more slack examples and comparison to optim.efi on problems
## involving equality and mixed (equality and inequality) constraints,
## see demo("ALfhat"), demo("GSBP") and demo("LAH")

## for comparison, here is a version that uses simulated annealing
## with the Additive Penalty Method (APM) for constraints
## Not run: 
aimprob.apm <- function(x, B=matrix(c(rep(0,2),rep(1,2)),ncol=2))
{ 
  ## check bounding box
  for(i in 1:length(x)) {
    if(x[i] < B[i,1] || x[i] > B[i,2]) return(Inf)
  }

  ## evaluate objective and constraints
  f <- sum(x)
  c1 <- 1.5-x[1]-2*x[2]-0.5*sin(2*pi*(x[1]^2-2*x[2]))
  c2 <- x[1]^2+x[2]^2-1.5

  ## return APM composite
  return(f + abs(c1) + abs(c2))
}

## use SA; specify control=list(maxit=100), say, to control max 
## number of iterations; does not easily facilitate plotting progress
out4 <- optim(runif(2), aimprob.apm, method="SANN") 
## check the final value, which typically does not satisfy both
## constraints
aimprob(out4$par)

## End(Not run)

## for a version with a modeled objective see demo("ALfhat")

[Package laGP version 1.5-9 Index]