oLBFGS_free {stochQN}R Documentation

oLBFGS Free-Mode Optimizer

Description

Optimizes an empirical (convex) loss function over batches of sample data. Compared to function/class 'oLBFGS', this version lets the user do all the calculations from the outside, only interacting with the object by means of a function that returns a request type and is fed the required calculation through a method 'update_gradient'.

Order in which requests are made:

========== loop ===========

* calc_grad

* calc_grad_same_batch (might skip if using check_nan)

===========================

After running this function, apply 'run_oLBFGS_free' to it to get the first requested piece of information.

Usage

oLBFGS_free(mem_size = 10, hess_init = NULL, min_curvature = 1e-04,
  y_reg = NULL, check_nan = TRUE, nthreads = -1)

Arguments

mem_size

Number of correction pairs to store for approximation of Hessian-vector products.

hess_init

Value to which to initialize the diagonal of H0. If passing 'NULL', will use the same initializion as for SQN ((s_last * y_last) / (y_last * y_last)).

min_curvature

Minimum value of (s * y) / (s * s) in order to accept a correction pair. Pass 'NULL' for no minimum.

y_reg

Regularizer for 'y' vector (gets added y_reg * s). Pass 'NULL' for no regularization.

check_nan

Whether to check for variables becoming NA after each iteration, and reverting the step if they do (will also reset BFGS memory).

nthreads

Number of parallel threads to use. If set to -1, will determine the number of available threads and use all of them. Note however that not all the computations can be parallelized, and the BLAS backend might use a different number of threads.

Value

An 'oLBFGS_free' object, which can be used through functions 'update_gradient' and 'run_oLBFGS_free'

References

See Also

update_gradient , run_oLBFGS_free

Examples

### Example optimizing Rosenbrock 2D function
### Note that this example is not stochastic, as the
### function is not evaluated in expectation based on
### batches of data, but rather it has a given absolute
### form that never varies.
### Warning: this optimizer is meant for convex functions
### (Rosenbrock's is not convex)
library(stochQN)


fr <- function(x) { ## Rosenbrock Banana function
	x1 <- x[1]
	x2 <- x[2]
	100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
	x1 <- x[1]
	x2 <- x[2]
	c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
	200 * (x2 - x1 * x1))
}

### Initial values of x
x_opt = as.numeric(c(0, 2))
cat(sprintf("Initial values of x: [%.3f, %.3f]\n",
	x_opt[1], x_opt[2]))
	
### Will use a constant step size throughout
### (not recommended)
step_size <- 1e-1

### Initialize the optimizer
optimizer <- oLBFGS_free()

### Keep track of the iteration number
curr_iter <- 0

### Run a loop for 100 iterations
### (Note that each iteration requires 2 calculations,
###  hence the 200)
for (i in 1:200) {
	req <- run_oLBFGS_free(optimizer, x_opt, step_size)
	if (req$task == "calc_grad") {
	  update_gradient(optimizer, grr(req$requested_on))
	} else if (req$task == "calc_grad_same_batch") {
	  update_gradient(optimizer, grr(req$requested_on))
	}
	
	### Track progress every 10 iterations
	if (req$info$iteration_number > curr_iter) {
	  curr_iter <- req$info$iteration_number
	  if ((curr_iter %% 10) == 0) {
	  cat(sprintf(
	   "Iteration %3d - Current function value: %.3f\n",
	  req$info$iteration_number, fr(x_opt)
	  ))
	  }
	}
}
cat(sprintf("Current values of x: [%.3f, %.3f]\n",
	x_opt[1], x_opt[2]))

[Package stochQN version 0.1.2-1 Index]