psqn_bfgs {psqn} | R Documentation |
BFGS Implementation Used Internally in the psqn Package
Description
The method seems to differ from the BFGS method in optim mainly in the line search. This version uses the interpolation method with a zoom phase based on cubic interpolation, as described by Nocedal and Wright (2006).
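To make the line search terms concrete, the following is a minimal base R sketch of two textbook ingredients from Nocedal and Wright (2006): a check of the strong Wolfe conditions for a trial step, and the minimizer of the cubic that interpolates the function values and slopes at two step lengths. The helpers `strong_wolfe` and `cubic_min` are hypothetical names used only for illustration. They are not part of the psqn API, and the package's internal code may differ, for example in which form of the Wolfe conditions it enforces.

```r
# Hypothetical helper, not part of psqn: check the strong Wolfe conditions
# for the step length alpha along the direction p starting from x.
strong_wolfe <- function(fn, gr, x, p, alpha, c1 = 1e-4, c2 = 0.9) {
  d0 <- sum(gr(x) * p)          # directional derivative at alpha = 0
  stopifnot(d0 < 0)             # p must be a descent direction
  x_new <- x + alpha * p
  sufficient_decrease <- fn(x_new) <= fn(x) + c1 * alpha * d0
  curvature <- abs(sum(gr(x_new) * p)) <= c2 * abs(d0)
  sufficient_decrease && curvature
}

# Hypothetical helper, not part of psqn: minimizer of the cubic that matches
# the values (f0, f1) and slopes (d0, d1) at the step lengths a0 and a1.
# The square root is NaN when the cubic has no local minimizer.
cubic_min <- function(a0, f0, d0, a1, f1, d1) {
  t1 <- d0 + d1 - 3 * (f0 - f1) / (a0 - a1)
  t2 <- sign(a1 - a0) * sqrt(t1^2 - d0 * d1)
  a1 - (a1 - a0) * (d1 + t2 - t1) / (d1 - d0 + 2 * t2)
}

# small illustration on a quadratic with a steepest descent direction
quad    <- function(x) sum(x^2)
quad_gr <- function(x) 2 * x
x0 <- c(1, 0)
p  <- -quad_gr(x0)
ok_half <- strong_wolfe(quad, quad_gr, x0, p, alpha = 0.5)  # exact minimizer
ok_one  <- strong_wolfe(quad, quad_gr, x0, p, alpha = 1)    # overshoots
# phi(a) = (a - 1)^2 sampled at a = 0 and a = 3
a_star <- cubic_min(0, 1, -2, 3, 4, 4)
```

In this sketch, `c1` and `c2` play the same roles as the arguments of the same names in the usage below: `c1` controls the sufficient decrease test and `c2` the curvature test.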
Usage
psqn_bfgs(
  par,
  fn,
  gr,
  rel_eps = 1e-08,
  max_it = 100L,
  c1 = 1e-04,
  c2 = 0.9,
  trace = 0L,
  env = NULL,
  gr_tol = -1,
  abs_eps = -1
)
Arguments
par |
Initial values for the parameters. |
fn |
Function to be minimized. |
gr |
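Gradient of fn. As in the examples, the returned vector carries the function value as an attribute called "value". |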
rel_eps |
Relative convergence threshold. |
max_it |
Maximum number of iterations. |
c1 |
Thresholds for the Wolfe condition. |
c2 |
Thresholds for the Wolfe condition. |
trace |
Integer where larger values give more information during the optimization. |
env |
Environment in which to evaluate fn and gr. |
gr_tol |
Convergence tolerance for the Euclidean norm of the gradient. A negative value yields no check. |
abs_eps |
Absolute convergence threshold. A negative value yields no check. |
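The Examples section passes a gradient function that returns the gradient with the function value attached as the "value" attribute. As a minimal sketch of that convention, assuming the attribute is all that is needed, an existing pair of function and gradient can be wrapped as below. The helper `with_value` is a hypothetical name and is not part of psqn.

```r
# Hypothetical helper, not part of psqn: combine a function and its gradient
# into one function that returns the gradient with the function value
# attached as the "value" attribute.
with_value <- function(fn, gr) {
  function(x) {
    out <- gr(x)
    attr(out, "value") <- fn(x)
    out
  }
}

# the Rosenbrock function from help(optim), as used in the Examples
fn <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
gr <- function(x) c(
  -400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
  200 * (x[2] - x[1]^2))

gr_psqn <- with_value(fn, gr)
g0 <- gr_psqn(c(-1.2, 1))
as.numeric(g0)     # the gradient at the starting value
attr(g0, "value")  # the function value at the starting value
```

The wrapper calls fn and gr separately, so it does not share any intermediate computation between the two. A hand-written function like gr_psqn in the Examples can reuse such terms.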
Value
An object like the object returned by psqn.
References
Nocedal, J. and Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.
Examples
# declare function and gradient from the example from help(optim)
fn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
gr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
    200 * (x2 - x1 * x1))
}
# the method in this package needs a gradient function that also
# returns the function value as the "value" attribute
gr_psqn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  out <- c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
           200 * (x2 - x1 * x1))
  attr(out, "value") <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  out
}
# we get the same result
optim(c(-1.2, 1), fn, gr, method = "BFGS")
psqn_bfgs(c(-1.2, 1), fn, gr_psqn)
# compare the computation time
system.time(replicate(1000,
  optim(c(-1.2, 1), fn, gr, method = "BFGS")))
system.time(replicate(1000,
  psqn_bfgs(c(-1.2, 1), fn, gr_psqn)))
# we can use an alternative convergence criterion
org <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4)
sqrt(sum(gr_psqn(org$par)^2))
new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4, gr_tol = 1e-8)
sqrt(sum(gr_psqn(new_res$par)^2))
new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1, abs_eps = 1e-2)
new_res$value - org$value # roughly zero here (but this is not guaranteed)