ao {ao} | R Documentation |
Alternating Optimization
Description
Alternating optimization is an iterative procedure for optimizing a real-valued function jointly over all its parameters by alternating restricted optimization over parameter partitions.
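The idea described above can be sketched in a few lines of base R. This is a minimal illustration only, not the package's implementation: the helper `alternate()` and its arguments are hypothetical, every block is optimized with `stats::optim()`, and the procedure stops once the function value no longer changes by more than a tolerance.

```r
# Himmelblau's function (also used in the Examples section)
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

# Hypothetical helper (not part of the package): minimize `f` by cycling
# through the parameter blocks and optimizing one block at a time
alternate <- function(f, initial, blocks, iteration_limit = 100,
                      tolerance = 1e-6) {
  x <- initial
  value <- f(x)
  for (iteration in seq_len(iteration_limit)) {
    value_old <- value
    for (block in blocks) {
      # restricted objective: only the parameters in `block` vary,
      # all other parameters stay fixed at their current values
      restricted <- function(p) {
        y <- x
        y[block] <- p
        f(y)
      }
      fit <- stats::optim(x[block], restricted, method = "L-BFGS-B")
      # accept the block update only if it does not worsen the value
      if (fit$value <= value) {
        x[block] <- fit$par
        value <- fit$value
      }
    }
    # stop once a full pass over all blocks no longer changes the value
    if (abs(value_old - value) < tolerance) break
  }
  list(estimate = x, value = value, iterations = iteration)
}

# one block per parameter, starting from the origin
result <- alternate(himmelblau, initial = c(0, 0), blocks = list(1, 2))
result$estimate
result$value
```

Each inner call to `stats::optim()` solves a one-dimensional problem here, which is the point of the approach: the sub-problems are smaller than the joint problem.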
Usage
ao(
  f,
  initial,
  target = NULL,
  npar = NULL,
  gradient = NULL,
  ...,
  partition = "sequential",
  new_block_probability = 0.3,
  minimum_block_number = 2,
  minimize = TRUE,
  lower = -Inf,
  upper = Inf,
  iteration_limit = Inf,
  seconds_limit = Inf,
  tolerance_value = 1e-06,
  tolerance_parameter = 1e-06,
  tolerance_parameter_norm = function(x, y) sqrt(sum((x - y)^2)),
  tolerance_history = 1,
  base_optimizer = Optimizer$new("stats::optim", method = "L-BFGS-B"),
  verbose = FALSE,
  hide_warnings = TRUE
)
Arguments
f |
( The first argument of If |
initial |
( This can also be a |
target |
( This can only be Can be |
npar |
( Must be specified if more than two target arguments are specified via
the Can be |
gradient |
( The function call of Can be |
... |
Additional arguments to be passed to |
partition |
(
This can also be a |
new_block_probability |
( The probability of starting a new parameter block when creating random partitions. Values close to 0 result in larger parameter blocks; values close to 1 result in smaller parameter blocks. |
minimum_block_number |
( The minimum number of blocks in random partitions. |
minimize |
( If |
lower , upper |
( |
iteration_limit |
( Can also be |
seconds_limit |
( Can also be Note that this stopping criterion is checked only after a sub-problem is solved, not while a sub-problem is being solved, so the actual process time can exceed this limit. |
tolerance_value |
( Can be |
tolerance_parameter |
( Can be By default, the distance is measured using the Euclidean norm, but another
norm can be specified via the |
tolerance_parameter_norm |
( It must be of the form |
tolerance_history |
( |
base_optimizer |
( By default, the This can also be a |
verbose |
( |
hide_warnings |
( |
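As an illustration of the tolerance_parameter_norm argument, the sketch below defines two alternatives to the default Euclidean distance. The required form is truncated on this page, so the two-argument form `function(x, y)` is an assumption taken from the default shown under Usage; the names `max_norm` and `manhattan_norm` are hypothetical.

```r
# default from the Usage section: Euclidean distance between two vectors
euclidean_norm <- function(x, y) sqrt(sum((x - y)^2))

# hypothetical alternatives with the same two-argument form
max_norm <- function(x, y) max(abs(x - y))        # largest single change
manhattan_norm <- function(x, y) sum(abs(x - y))  # sum of absolute changes

# made-up parameter vectors from two consecutive iterations
previous <- c(1, 2, 3)
current <- c(1, 2.5, 1)

euclidean_norm(current, previous)
max_norm(current, previous)
manhattan_norm(current, previous)

# assumed usage, following the signature under Usage:
# ao(f = ..., initial = ..., tolerance_parameter_norm = max_norm)
```

The maximum norm reacts to the single largest parameter change, so it can be a stricter choice than the Euclidean norm when one parameter keeps moving while the others have settled.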
Details
Multiple threads
Alternating optimization can suffer from local optima. To increase the likelihood of reaching the global optimum, you can specify:
- multiple starting parameters
- multiple parameter partitions
- multiple base optimizers
Use the initial, partition, and/or base_optimizer arguments to provide a list of possible values for each of these arguments. Each combination of initial values, parameter partitions, and base optimizers will create a separate alternating optimization thread.
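To see how quickly the number of threads grows, the combinations can be counted with base R. This is only a sketch of the counting: the candidate values below are made-up placeholders (the optimizer entries are plain strings, not optimizer objects), and how the package enumerates the combinations internally is not described here.

```r
# hypothetical candidate values for the three arguments
initial <- list(c(0, 0), c(1, 1), c(-2, 2))
partition <- list("sequential", "random")
base_optimizer <- list("optimizer_a", "optimizer_b")

# one row per combination, i.e. one row per thread
threads <- expand.grid(
  initial = seq_along(initial),
  partition = seq_along(partition),
  base_optimizer = seq_along(base_optimizer)
)

# 3 initial values * 2 partitions * 2 base optimizers
nrow(threads)
```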
Output value
In the case of multiple threads, the output changes slightly in comparison to the standard case. It is still a list with the following elements:
- estimate is the optimal parameter vector over all threads.
- value is the optimal function value over all threads.
- details combines the details of the single threads and has an additional column thread with an index for the different threads.
- seconds gives the computation time in seconds for each thread.
- stopping_reason gives the termination message for each thread.
- threads gives details on how the different threads were specified.
Parallel computation
By default, threads run sequentially. However, since they are independent, they can be parallelized. To enable parallel computation, use the {future} framework. For example, run the following before the ao() call:
future::plan(future::multisession, workers = 4)
Progress updates
When using multiple threads, setting verbose = TRUE to print tracing details during alternating optimization is not supported. However, you can still track the progress of threads using the {progressr} framework. For example, run the following before the ao() call:
progressr::handlers(global = TRUE)
progressr::handlers(
  progressr::handler_progress(":percent :eta :message")
)
Value
A list with the following elements:
- estimate is the parameter vector at termination.
- value is the function value at termination.
- details is a data.frame with full information about the procedure: for each iteration (column iteration), it contains the function value (column value), the parameter values (columns starting with p followed by the parameter index), the active parameter block (columns starting with b followed by the parameter index, where 1 marks a parameter contained in the active parameter block and 0 one that is not), and the computation time in seconds (column seconds).
- seconds is the overall computation time in seconds.
- stopping_reason is a message stating why the procedure terminated.
In the case of multiple threads, the output changes slightly; see Details.
Examples
# Example 1: Minimization of Himmelblau's function --------------------------
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2
ao(f = himmelblau, initial = c(0, 0))
# Example 2: Maximization of 2-class Gaussian mixture log-likelihood --------
# target arguments:
# - class means mu (2, unrestricted)
# - class standard deviations sd (2, must be non-negative)
# - class proportion lambda (only 1 for identification, must be in [0, 1])
normal_mixture_llk <- function(mu, sd, lambda, data) {
  c1 <- lambda * dnorm(data, mu[1], sd[1])
  c2 <- (1 - lambda) * dnorm(data, mu[2], sd[2])
  sum(log(c1 + c2))
}
ao(
  f = normal_mixture_llk,
  initial = c(2, 4, 1, 1, 0.5),
  target = c("mu", "sd", "lambda"),
  npar = c(2, 2, 1),
  data = datasets::faithful$eruptions,
  partition = "random",
  minimize = FALSE,
  lower = c(-Inf, -Inf, 0, 0, 0),
  upper = c(Inf, Inf, Inf, Inf, 1)
)