oLBFGS_free {stochQN}                                R Documentation
oLBFGS Free-Mode Optimizer
Description
Optimizes an empirical (convex) loss function over batches of sample data. Compared to the function/class 'oLBFGS', this version lets the user perform all the calculations externally: the optimizer is driven by a function ('run_oLBFGS_free') that returns a request type, and the requested calculation is fed back to the object through the method 'update_gradient'.
Order in which requests are made:
========== loop ===========
* calc_grad
* calc_grad_same_batch (might be skipped if using 'check_nan')
===========================
After running this function, apply 'run_oLBFGS_free' to it to get the first requested piece of information.
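In outline, a driver loop under this protocol looks as follows (a minimal sketch; 'grad_fun', 'x', 'step_size', and the stopping condition 'done' are placeholders for the user's own choices):

optimizer <- oLBFGS_free()
repeat {
    req <- run_oLBFGS_free(optimizer, x, step_size)
    ## both request types ask for the gradient at 'req$requested_on'
    if (req$task %in% c("calc_grad", "calc_grad_same_batch")) {
        update_gradient(optimizer, grad_fun(req$requested_on))
    }
    if (done) break  ## user-defined stopping rule
}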
Usage
oLBFGS_free(mem_size = 10, hess_init = NULL, min_curvature = 1e-04,
            y_reg = NULL, check_nan = TRUE, nthreads = -1)
Arguments
mem_size
    Number of correction pairs to store for the approximation of Hessian-vector products.

hess_init
    Value to which to initialize the diagonal of H0. If passing 'NULL', will use the same initialization as for SQN ((s_last * y_last) / (y_last * y_last)); see the illustration after this list.

min_curvature
    Minimum value of (s * y) / (s * s) in order to accept a correction pair. Pass 'NULL' for no minimum.

y_reg
    Regularizer for the 'y' vector (y_reg * s gets added to it). Pass 'NULL' for no regularization.

check_nan
    Whether to check whether the variables have become NA after each iteration, and to revert the step if they have (this will also reset the BFGS memory).

nthreads
    Number of parallel threads to use. If set to -1, will determine the number of available threads and use all of them. Note however that not all of the computations can be parallelized, and the BLAS backend might use a different number of threads.
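For concreteness, the quantities mentioned above for 'hess_init', 'min_curvature', and 'y_reg' correspond to the following (a standalone illustration with arbitrary example vectors, not package internals):

## 's' and 'y' are arbitrary example correction vectors
s <- c(0.5, -0.2)
y <- c(0.4, -0.1)
## SQN-style initialization of the diagonal of H0
hess_init <- sum(s * y) / sum(y * y)
## curvature check for accepting the correction pair
accept_pair <- (sum(s * y) / sum(s * s)) >= 1e-4
## regularization of 'y'
y <- y + 1e-3 * s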
Value
An 'oLBFGS_free' object, which can be used through the functions 'update_gradient' and 'run_oLBFGS_free'.
References
Schraudolph, N.N., Yu, J. and Guenter, S., 2007, March. "A stochastic quasi-Newton method for online convex optimization." In Artificial Intelligence and Statistics (pp. 436-443).
Wright, S. and Nocedal, J., 1999. "Numerical optimization" (ch. 7). Springer Science.
See Also
update_gradient, run_oLBFGS_free
Examples
### Example optimizing Rosenbrock 2D function
### Note that this example is not stochastic, as the
### function is not evaluated in expectation based on
### batches of data, but rather it has a given absolute
### form that never varies.
### Warning: this optimizer is meant for convex functions
### (Rosenbrock's is not convex)
library(stochQN)
fr <- function(x) {   ## Rosenbrock Banana function
    x1 <- x[1]
    x2 <- x[2]
    100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) {  ## Gradient of 'fr'
    x1 <- x[1]
    x2 <- x[2]
    c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
      200 * (x2 - x1 * x1))
}
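### Optional: verify the analytic gradient against
### central finite differences (an illustrative sanity
### check, not part of the package)
num_grad <- function(f, x, h = 1e-6) {
    sapply(seq_along(x), function(j) {
        e <- numeric(length(x)); e[j] <- h
        (f(x + e) - f(x - e)) / (2 * h)
    })
}
stopifnot(max(abs(grr(c(0, 2)) - num_grad(fr, c(0, 2)))) < 1e-4)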
### Initial values of x
x_opt <- as.numeric(c(0, 2))
cat(sprintf("Initial values of x: [%.3f, %.3f]\n",
x_opt[1], x_opt[2]))
### Will use a constant step size throughout
### (not recommended)
step_size <- 1e-1
### Initialize the optimizer
optimizer <- oLBFGS_free()
### Keep track of the iteration number
curr_iter <- 0
### Run a loop for 100 iterations
### (Note that each iteration requires 2 calculations,
### hence the 200)
for (i in 1:200) {
    req <- run_oLBFGS_free(optimizer, x_opt, step_size)
    ## both request types ask for the gradient evaluated
    ## at 'req$requested_on', so they are handled alike
    if (req$task == "calc_grad") {
        update_gradient(optimizer, grr(req$requested_on))
    } else if (req$task == "calc_grad_same_batch") {
        update_gradient(optimizer, grr(req$requested_on))
    }

    ### Track progress every 10 iterations
    if (req$info$iteration_number > curr_iter) {
        curr_iter <- req$info$iteration_number
        if ((curr_iter %% 10) == 0) {
            cat(sprintf(
                "Iteration %3d - Current function value: %.3f\n",
                req$info$iteration_number, fr(x_opt)
            ))
        }
    }
}
cat(sprintf("Current values of x: [%.3f, %.3f]\n",
x_opt[1], x_opt[2]))
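### Variation: since a constant step size is not
### recommended, the same loop can use a decaying
### schedule (the 1/sqrt schedule below is one common
### illustrative choice, not a package default)
x_opt <- as.numeric(c(0, 2))
optimizer <- oLBFGS_free()
base_step <- 1e-1
for (i in 1:200) {
    ## decay once per iteration (2 calls per iteration)
    step_i <- base_step / sqrt(ceiling(i / 2))
    req <- run_oLBFGS_free(optimizer, x_opt, step_i)
    if (req$task %in% c("calc_grad", "calc_grad_same_batch")) {
        update_gradient(optimizer, grr(req$requested_on))
    }
}
cat(sprintf("Values of x with decaying step: [%.3f, %.3f]\n",
            x_opt[1], x_opt[2]))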