FTRL {opera}    R Documentation
Implementation of FTRL (Follow The Regularized Leader)
Description
FTRL (Follow The Regularized Leader), introduced by Shalev-Shwartz and Singer (2007) and covered in Chapter 5 of Hazan (2019), is the online counterpart of empirical risk minimization.
It is a family of aggregation rules (including OGD) that, at each round, plays the empirical risk
minimizer over the data seen so far, with an additional regularization term. The online optimization can be performed
on any bounded convex set that can be expressed with equality or inequality constraints.
Note that this method is still under development and should be considered a beta version.
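Concretely, in one common parameterization (the formula below follows Hazan 2019 and is given here for orientation; it is not taken verbatim from the package), the weight vector played at round t+1 minimizes the cumulative past loss plus the scaled regularizer over the constraint set K:

    w_{t+1} = \arg\min_{w \in K} \sum_{s=1}^{t} \ell_s(w) + \frac{1}{\eta} R(w)

Here \ell_s is the loss incurred at round s, R corresponds to fun_reg, \eta to eta, and K is the convex set defined by constr_eq = 0 and constr_ineq > 0.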
Usage
FTRL(
  y,
  experts,
  eta = NULL,
  fun_reg = NULL,
  fun_reg_grad = NULL,
  constr_eq = NULL,
  constr_eq_jac = NULL,
  constr_ineq = NULL,
  constr_ineq_jac = NULL,
  loss.type = list(name = "square"),
  loss.gradient = TRUE,
  w0 = NULL,
  max_iter = 50,
  obj_tol = 0.01,
  training = NULL,
  default = FALSE,
  quiet = TRUE
)
Arguments
y: vector. Real observations.

experts: matrix. Matrix of experts' predictions.

eta: numeric. Regularization parameter.

fun_reg: function (NULL). Regularization function to be applied during the optimization (see the sketch after this argument list for how fun_reg and the constraint arguments fit together).

fun_reg_grad: function (NULL). Gradient of the regularization function (to speed up the computations).

constr_eq: function (NULL). Equality constraints to be applied during the optimization.

constr_eq_jac: function (NULL). Jacobian of the equality constraints (to speed up the computations).

constr_ineq: function (NULL). Inequality constraints to be applied during the optimization (constraints of the form ... > 0).

constr_ineq_jac: function (NULL). Jacobian of the inequality constraints (to speed up the computations).
loss.type: character, list or function ("square").
- character: name of the loss to be applied ('square', 'absolute', 'percentage', or 'pinball');
- list: a list with field name equal to the loss name; if using pinball loss, a field tau equal to the required quantile in [0, 1];
- function: a custom loss taking two arguments (prediction, label).

loss.gradient: boolean, function (TRUE).
- boolean: if TRUE, the aggregation rule is applied not to the loss itself but to a gradient (linearized) version of it; the aggregation rule is then similar to a gradient-descent aggregation rule;
- function: if loss.type is a function, the derivative of the loss with respect to its first argument must be supplied here (it is not computed automatically).
w0: numeric (NULL). Vector of initial weights.

max_iter: integer (50). Maximum number of iterations of the optimization algorithm per round.

obj_tol: numeric (1e-2). Tolerance on the objective function between two iterations of the optimization.

training: list (NULL). List of previous parameters.

default: boolean (FALSE). Whether or not to use default parameters for fun_reg, constr_eq, constr_ineq and their gradient/Jacobian counterparts; when TRUE, any values supplied for these arguments are ignored.
quiet: boolean (TRUE). Whether or not to suppress the display of progress bars.
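As an illustration of how fun_reg and the constraint arguments fit together, here is a minimal sketch restricting the weights to the probability simplex (sum(w) = 1, w > 0) with a squared-L2 regularizer. The data are simulated purely for this example, and the Jacobian shapes used below (one row per constraint) are an assumption of the sketch, not a statement of the package defaults.

## Sketch: FTRL on the probability simplex (assumed setup, not the
## package defaults).
library(opera)

set.seed(1)
n <- 100
y <- rnorm(n)                                   # simulated observations
experts <- cbind(y + rnorm(n), y + rnorm(n))    # two noisy experts
colnames(experts) <- c("expert1", "expert2")

agg <- FTRL(
  y, experts,
  fun_reg         = function(w) sum(w^2) / 2,   # R(w) = ||w||^2 / 2
  fun_reg_grad    = function(w) w,              # gradient of R
  constr_eq       = function(w) sum(w) - 1,     # equality: sum(w) - 1 = 0
  constr_eq_jac   = function(w) matrix(1, 1, length(w)),   # 1 x K Jacobian
  constr_ineq     = function(w) w,              # inequality: each w > 0
  constr_ineq_jac = function(w) diag(length(w)) # K x K identity Jacobian
)

With these constraints the weights remain a convex combination of the experts, which is the usual setting for expert aggregation.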
Value
An object of class mixture.
References
Hazan E (2019).
“Introduction to online convex optimization.”
arXiv preprint arXiv:1909.05207.
Shalev-Shwartz S, Singer Y (2007).
“A primal-dual perspective of online learning algorithms.”
Machine Learning, 69(2), 115–142.
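Examples
A minimal usage sketch (added here for orientation, not taken from the package's own examples): the data are simulated purely for illustration, and default = TRUE asks the package to supply fun_reg and the constraints itself.

## Simulated data: one informative expert and one pure-noise expert.
library(opera)
set.seed(42)
n <- 50
y <- rnorm(n)
experts <- cbind(y + rnorm(n, sd = 0.5), rnorm(n))
colnames(experts) <- c("informative", "noise")

## Run FTRL with the package's default regularization and constraints.
mix <- FTRL(y, experts, default = TRUE)
summary(mix)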
[Package opera version 1.2.0 Index]