ao {ao}                                                R Documentation

Alternating Optimization

Description

Alternating optimization is an iterative procedure for optimizing a real-valued function jointly over all its parameters by alternating restricted optimization over parameter partitions.

Usage

ao(
  f,
  initial,
  target = NULL,
  npar = NULL,
  gradient = NULL,
  ...,
  partition = "sequential",
  new_block_probability = 0.3,
  minimum_block_number = 2,
  minimize = TRUE,
  lower = -Inf,
  upper = Inf,
  iteration_limit = Inf,
  seconds_limit = Inf,
  tolerance_value = 1e-06,
  tolerance_parameter = 1e-06,
  tolerance_parameter_norm = function(x, y) sqrt(sum((x - y)^2)),
  tolerance_history = 1,
  base_optimizer = Optimizer$new("stats::optim", method = "L-BFGS-B"),
  verbose = FALSE,
  hide_warnings = TRUE
)

Arguments

f

(function)
A function to be optimized, returning a single numeric value.

The first argument of f should be a numeric vector of the same length as initial, optionally followed by any other arguments specified by the ... argument.

If f is to be optimized over an argument other than the first, or more than one argument, this has to be specified via the target argument.

initial

(numeric() or list())
The starting parameter values for the target argument(s).

This can also be a list of multiple starting parameter values, see details.

target

(character() or NULL)
The name(s) of the argument(s) over which f gets optimized.

Only numeric arguments can be targets.

Can be NULL (default), in which case f is optimized over its first argument.

npar

(integer())
The length(s) of the target argument(s).

Must be specified if more than one target argument is specified via the target argument.

Can be NULL if there is only one target argument, in which case npar is set to be length(initial).

gradient

(function or NULL)
A function that returns the gradient of f.

gradient must accept the same arguments as f.

Can be NULL, in which case a finite-difference approximation will be used.
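
As an illustration, an analytical gradient for Himmelblau's function (the function used in the Examples section) could be written as follows. The name himmelblau_gradient and the finite-difference check are illustrative and not part of the package; the sketch only shows that gradient takes the same arguments as f and returns one partial derivative per parameter.

```r
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

# Analytical gradient with the same argument list as himmelblau
# (hypothetical helper name, not provided by the package)
himmelblau_gradient <- function(x) {
  a <- x[1]^2 + x[2] - 11
  b <- x[1] + x[2]^2 - 7
  c(4 * x[1] * a + 2 * b, 2 * a + 4 * x[2] * b)
}

# Sanity check against a central finite difference at an arbitrary point
x0 <- c(1, 2)
h <- 1e-6
fd <- sapply(1:2, function(i) {
  e <- replace(numeric(2), i, h)
  (himmelblau(x0 + e) - himmelblau(x0 - e)) / (2 * h)
})
max_abs_diff <- max(abs(fd - himmelblau_gradient(x0)))
```

It could then be supplied as ao(f = himmelblau, initial = c(0, 0), gradient = himmelblau_gradient).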

...

Additional arguments to be passed to f (and gradient).

partition

(character(1) or list())
Defines the parameter partition, and can be either

  • "sequential" for treating each parameter separately,

  • "random" for a random partition in each iteration,

  • "none" for no partition (which is equivalent to joint optimization),

  • or a list of vectors of parameter indices, specifying a custom partition for the alternating optimization process.

This can also be a list of multiple partition definitions, see details.
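
To make the role of a partition concrete, the following base-R sketch alternates restricted optimization over the blocks of a custom partition. It is a simplified illustration under stated assumptions (a fixed number of 20 iterations, no stopping criteria, stats::optim as the solver), not the package's implementation.

```r
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

# Custom partition: one block per list element (here the same as "sequential")
partition <- list(1, 2)

x <- c(0, 0)
for (iteration in 1:20) {
  for (block in partition) {
    # Restricted objective: vary only the parameters in `block`, fix the rest
    restricted <- function(theta) {
      full <- x
      full[block] <- theta
      himmelblau(full)
    }
    x[block] <- stats::optim(x[block], restricted, method = "L-BFGS-B")$par
  }
}
value <- himmelblau(x)
```

With ao(), the same partition would be passed as partition = list(1, 2).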

new_block_probability

(numeric(1))
Only relevant if partition = "random".

The probability of starting a new parameter block when creating a random partition.

Values close to 0 result in larger parameter blocks, values close to 1 result in smaller parameter blocks.

minimum_block_number

(integer(1))
Only relevant if partition = "random".

The minimum number of blocks in random partitions.

minimize

(logical(1))
Whether to minimize during the alternating optimization process.

If FALSE, maximization is performed.

lower, upper

(numeric())
Optional lower and upper parameter bounds.

iteration_limit

(integer(1) or Inf)
The maximum number of iterations through the parameter partition before the alternating optimization process is terminated.

Can also be Inf for no iteration limit.

seconds_limit

(numeric(1))
The time limit in seconds before the alternating optimization process is terminated.

Can also be Inf for no time limit.

Note that this stopping criterion is only checked after a sub-problem is solved, not while a sub-problem is being solved, so the actual process time can exceed this limit.

tolerance_value

(numeric(1))
A non-negative tolerance value. The alternating optimization terminates if the absolute difference between the current function value and the one from tolerance_history iterations earlier is smaller than tolerance_value.

Can be 0 for no value threshold.

tolerance_parameter

(numeric(1))
A non-negative tolerance value. The alternating optimization terminates if the distance between the current estimate and the one from tolerance_history iterations earlier is smaller than tolerance_parameter.

Can be 0 for no parameter threshold.

By default, the distance is measured using the Euclidean norm, but another norm can be specified via the tolerance_parameter_norm argument.

tolerance_parameter_norm

(function)
The norm that measures the distance between the current estimate and the one from tolerance_history iterations earlier (by default, the last iteration). If the distance is smaller than tolerance_parameter, the procedure is terminated.

It must be of the form function(x, y) for two vector inputs x and y, and return a single numeric value. By default, the Euclidean norm function(x, y) sqrt(sum((x - y)^2)) is used.
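
As a small illustration of the required form, the sketch below compares the default norm with an alternative that measures the largest coordinate-wise change. The names euclidean_norm and max_norm and the two example vectors are illustrative only.

```r
# Default Euclidean norm, as given in the Usage section
euclidean_norm <- function(x, y) sqrt(sum((x - y)^2))

# Alternative (illustrative): maximum absolute coordinate-wise change
max_norm <- function(x, y) max(abs(x - y))

previous <- c(1, 2, 3)
current <- c(1, 2.5, 3.1)
d_euclidean <- euclidean_norm(current, previous)
d_max <- max_norm(current, previous)
```

Either function could be passed as tolerance_parameter_norm, since both take two vectors and return a single numeric value.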

tolerance_history

(integer(1))
The number of iterations to look back to determine whether tolerance_value or tolerance_parameter has been reached.
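
The look-back can be pictured with the following sketch of the value criterion. The helper value_converged and the sequence of function values are hypothetical; the sketch follows the description of tolerance_value above and is not taken from the package's source.

```r
# Hypothetical helper: TRUE if the value criterion is met at the latest iteration
value_converged <- function(values, tolerance_value = 1e-6, tolerance_history = 1) {
  n <- length(values)
  if (n <= tolerance_history) return(FALSE)  # not enough history yet
  abs(values[n] - values[n - tolerance_history]) < tolerance_value
}

# Made-up function values after each iteration
values <- c(10, 4, 1.5, 1.2, 1.2000001)
converged_history_1 <- value_converged(values, tolerance_history = 1)
converged_history_2 <- value_converged(values, tolerance_history = 2)
```

A larger tolerance_history makes the criterion stricter, because the comparison spans more iterations.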

base_optimizer

(Optimizer or list())
An Optimizer object, which can be created via Optimizer$new(). It numerically solves the sub-problems.

By default, stats::optim with method = "L-BFGS-B" is used. If another optimizer is specified, the arguments gradient, lower, and upper are ignored.

This can also be a list of multiple base optimizers, see details.

verbose

(logical(1))
Whether to print tracing details during the alternating optimization process.

hide_warnings

(logical(1))
Whether to hide warnings during the alternating optimization process.

Details

Multiple threads

Alternating optimization can suffer from local optima. To increase the likelihood of reaching the global optimum, you can run multiple alternating optimization threads.

Use the initial, partition, and/or base_optimizer arguments to provide a list of possible values for the respective argument. Each combination of initial values, parameter partitions, and base optimizers creates a separate alternating optimization thread.
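
To see how many threads a given specification implies, the combinations can be enumerated with base R. The starting values below are arbitrary, and how ao() enumerates the combinations internally is an assumption; only the count follows from the description above.

```r
# Three starting values and two partition definitions (arbitrary choices)
initial <- list(c(0, 0), c(1, 1), c(-2, 3))
partition <- list("sequential", "random")

# One row per combination of starting value and partition (indices into the lists)
threads <- expand.grid(
  initial = seq_along(initial),
  partition = seq_along(partition)
)
n_threads <- nrow(threads)
```

These lists could then be passed directly, for example as ao(f = himmelblau, initial = initial, partition = partition) with himmelblau as defined in the Examples section.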

Output value

In the case of multiple threads, the output changes slightly in comparison to the standard case. It is still a list with the following elements:

Parallel computation

By default, threads run sequentially. However, since they are independent, they can be parallelized. To enable parallel computation, use the {future} framework. For example, run the following before the ao() call:

future::plan(future::multisession, workers = 4)

Progress updates

When using multiple threads, setting verbose = TRUE to print tracing details during alternating optimization is not supported. However, you can still track the progress of threads using the {progressr} framework. For example, run the following before the ao() call:

progressr::handlers(global = TRUE)
progressr::handlers(
  progressr::handler_progress(":percent :eta :message")
)

Value

A list with the following elements:

In the case of multiple threads, the output changes slightly, see details.

Examples

# Example 1: Minimization of Himmelblau's function --------------------------

himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2
ao(f = himmelblau, initial = c(0, 0))

# Example 2: Maximization of 2-class Gaussian mixture log-likelihood --------

# target arguments:
# - class means mu (2, unrestricted)
# - class standard deviations sd (2, must be non-negative)
# - class proportion lambda (only 1 for identification, must be in [0, 1])

normal_mixture_llk <- function(mu, sd, lambda, data) {
  c1 <- lambda * dnorm(data, mu[1], sd[1])
  c2 <- (1 - lambda) * dnorm(data, mu[2], sd[2])
  sum(log(c1 + c2))
}

ao(
  f = normal_mixture_llk,
  initial = c(2, 4, 1, 1, 0.5),
  target = c("mu", "sd", "lambda"),
  npar = c(2, 2, 1),
  data = datasets::faithful$eruptions,
  partition = "random",
  minimize = FALSE,
  lower = c(-Inf, -Inf, 0, 0, 0),
  upper = c(Inf, Inf, Inf, Inf, 1)
)


[Package ao version 1.1.0 Index]