optimizer_lamb {tfaddons} | R Documentation |
Layer-wise Adaptive Moments
Description
Optimizer that implements Layer-wise Adaptive Moments (LAMB).
Usage
optimizer_lamb(
  learning_rate = 0.001,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-06,
  weight_decay_rate = 0,
  exclude_from_weight_decay = NULL,
  exclude_from_layer_adaptation = NULL,
  name = "LAMB",
  clipnorm = NULL,
  clipvalue = NULL,
  decay = NULL,
  lr = NULL
)
Arguments
learning_rate |
The learning rate. A 'Tensor', a floating point value, or a schedule that is a 'tf$keras$optimizers$schedules$LearningRateSchedule'. |
beta_1 |
A 'float' value or a constant 'float' tensor. The exponential decay rate for the 1st moment estimates. |
beta_2 |
A 'float' value or a constant 'float' tensor. The exponential decay rate for the 2nd moment estimates. |
epsilon |
A small constant for numerical stability. |
weight_decay_rate |
Weight decay rate. Defaults to 0. |
exclude_from_weight_decay |
List of regex patterns identifying variables to exclude from weight decay. A variable is excluded if its name contains a substring matching any of the patterns. |
exclude_from_layer_adaptation |
List of regex patterns identifying variables to exclude from layer adaptation. A variable is excluded if its name contains a substring matching any of the patterns. |
name |
Optional name for the operations created when applying gradients. Defaults to "LAMB". |
clipnorm |
Clips gradients by norm. |
clipvalue |
Clips gradients by value. |
decay |
Included for backward compatibility; allows inverse time decay of the learning rate. |
lr |
Included for backward compatibility; use learning_rate instead. |
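The substring matching described for exclude_from_weight_decay and exclude_from_layer_adaptation can be sketched in base R. This is only an illustration, not the optimizer's own code: the variable names, the patterns, and the helper is_excluded() are hypothetical, and it assumes the matching is an unanchored regular-expression search. Base R's regex dialect may also differ in details from the one used by the underlying TensorFlow implementation.

```r
# Hypothetical variable names, in the style Keras assigns to layer weights.
var_names <- c(
  "dense/kernel:0",
  "dense/bias:0",
  "layer_normalization/gamma:0",
  "layer_normalization/beta:0"
)

# Hypothetical exclusion patterns. A variable counts as excluded if ANY
# pattern matches somewhere inside its name (no anchoring is implied).
patterns <- c("bias", "layer_normalization")

# Hypothetical helper: TRUE if at least one pattern matches the name.
is_excluded <- function(name, patterns) {
  any(vapply(patterns, function(p) grepl(p, name), logical(1)))
}

excluded <- vapply(var_names, is_excluded, logical(1), patterns = patterns)
print(excluded)
```

Under these assumptions only "dense/kernel:0" is left subject to the rule in question. A pattern such as "bias" matches any name containing that substring, so patterns should be specific enough not to catch variables that are meant to stay included.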
Value
Optimizer for use with 'keras::compile()'
Examples
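## Not run: 
library(keras)
library(tfaddons)

# Compile a small dense model with the LAMB optimizer at its defaults
keras_model_sequential() %>%
  layer_dense(32, input_shape = c(784)) %>%
  compile(
    optimizer = optimizer_lamb(),
    loss = 'binary_crossentropy',
    metrics = 'accuracy'
  )
## End(Not run)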