FTRL {opera}    R Documentation
Implementation of FTRL (Follow The Regularized Leader)
Description
FTRL (Follow The Regularized Leader), introduced by Shalev-Shwartz and Singer (2007) and covered in Chapter 5 of Hazan (2019), is the online counterpart of empirical risk minimization.
It is a family of aggregation rules (including OGD) that, at each round, plays the empirical risk
minimizer over the data seen so far, with an additional regularization term. The online optimization can be performed
on any bounded convex set that can be expressed with equality or inequality constraints.
Note that this method is still under development and should be considered a beta version.
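Concretely, in one common parameterization (the formula below follows Hazan 2019 and is given here for orientation; it is not taken verbatim from the package), the weight vector played at round t+1 minimizes the cumulative past loss plus the scaled regularizer over the constraint set K:

    w_{t+1} = \arg\min_{w \in K} \sum_{s=1}^{t} \ell_s(w) + \frac{1}{\eta} R(w)

Here \ell_s is the loss incurred at round s, R corresponds to fun_reg, \eta to eta, and K is the convex set defined by constr_eq = 0 and constr_ineq > 0.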
Usage
FTRL(
  y,
  experts,
  eta = NULL,
  fun_reg = NULL,
  fun_reg_grad = NULL,
  constr_eq = NULL,
  constr_eq_jac = NULL,
  constr_ineq = NULL,
  constr_ineq_jac = NULL,
  loss.type = list(name = "square"),
  loss.gradient = TRUE,
  w0 = NULL,
  max_iter = 50,
  obj_tol = 0.01,
  training = NULL,
  default = FALSE,
  quiet = TRUE
)
Arguments
y: vector. Real observations.

experts: matrix. Matrix of experts' predictions.

eta: numeric. Regularization parameter.

fun_reg: function (NULL). Regularization function to be applied during the optimization (see the sketch after this argument list for how fun_reg and the constraint arguments fit together).

fun_reg_grad: function (NULL). Gradient of the regularization function (to speed up the computations).

constr_eq: function (NULL). Equality constraints to be applied during the optimization.

constr_eq_jac: function (NULL). Jacobian of the equality constraints (to speed up the computations).

constr_ineq: function (NULL). Inequality constraints to be applied during the optimization (constraints of the form ... > 0).

constr_ineq_jac: function (NULL). Jacobian of the inequality constraints (to speed up the computations).
loss.type: character, list or function ("square").
- character: name of the loss to be applied ('square', 'absolute', 'percentage', or 'pinball');
- list: a list with field name equal to the loss name; if using pinball loss, a field tau equal to the required quantile in [0, 1];
- function: a custom loss taking two arguments (prediction, label).

loss.gradient: boolean, function (TRUE).
- boolean: if TRUE, the aggregation rule is applied not to the loss itself but to a gradient (linearized) version of it; the aggregation rule is then similar to a gradient-descent aggregation rule;
- function: if loss.type is a function, the derivative of the loss with respect to its first argument must be supplied here (it is not computed automatically).
w0: numeric (NULL). Vector of initial weights.

max_iter: integer (50). Maximum number of iterations of the optimization algorithm per round.

obj_tol: numeric (1e-2). Tolerance on the objective function between two iterations of the optimization.

training: list (NULL). List of previous parameters.

default: boolean (FALSE). Whether or not to use default parameters for fun_reg, constr_eq, constr_ineq and their gradient/Jacobian counterparts; when TRUE, any values supplied for these arguments are ignored.
quiet: boolean (TRUE). Whether or not to suppress the display of progress bars.
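As an illustration of how fun_reg and the constraint arguments fit together, here is a minimal sketch restricting the weights to the probability simplex (sum(w) = 1, w > 0) with a squared-L2 regularizer. The data are simulated purely for this example, and the Jacobian shapes used below (one row per constraint) are an assumption of the sketch, not a statement of the package defaults.

## Sketch: FTRL on the probability simplex (assumed setup, not the
## package defaults).
library(opera)

set.seed(1)
n <- 100
y <- rnorm(n)                                   # simulated observations
experts <- cbind(y + rnorm(n), y + rnorm(n))    # two noisy experts
colnames(experts) <- c("expert1", "expert2")

agg <- FTRL(
  y, experts,
  fun_reg         = function(w) sum(w^2) / 2,   # R(w) = ||w||^2 / 2
  fun_reg_grad    = function(w) w,              # gradient of R
  constr_eq       = function(w) sum(w) - 1,     # equality: sum(w) - 1 = 0
  constr_eq_jac   = function(w) matrix(1, 1, length(w)),   # 1 x K Jacobian
  constr_ineq     = function(w) w,              # inequality: each w > 0
  constr_ineq_jac = function(w) diag(length(w)) # K x K identity Jacobian
)

With these constraints the weights remain a convex combination of the experts, which is the usual setting for expert aggregation.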
Value
An object of class mixture.
References
Hazan E (2019).
“Introduction to online convex optimization.”
arXiv preprint arXiv:1909.05207.
Shalev-Shwartz S, Singer Y (2007).
“A primal-dual perspective of online learning algorithms.”
Machine Learning, 69(2), 115–142.
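Examples
A minimal usage sketch (added here for orientation, not taken from the package's own examples): the data are simulated purely for illustration, and default = TRUE asks the package to supply fun_reg and the constraints itself.

## Simulated data: one informative expert and one pure-noise expert.
library(opera)
set.seed(42)
n <- 50
y <- rnorm(n)
experts <- cbind(y + rnorm(n, sd = 0.5), rnorm(n))
colnames(experts) <- c("informative", "noise")

## Run FTRL with the package's default regularization and constraints.
mix <- FTRL(y, experts, default = TRUE)
summary(mix)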
[Package opera version 1.2.0 Index]