optim_adahessian {torchopt}    R Documentation
Adahessian optimizer
Description
R implementation of the Adahessian optimizer proposed by Yao et al. (2021). The original implementation is available at https://github.com/amirgholami/adahessian.
Usage
optim_adahessian(
params,
lr = 0.15,
betas = c(0.9, 0.999),
eps = 1e-04,
weight_decay = 0,
hessian_power = 0.5
)
Arguments
params
Iterable of parameters to optimize.
lr
Learning rate (default: 0.15).
betas
Coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999)).
eps
Term added to the denominator to improve numerical stability (default: 1e-4).
weight_decay
L2 penalty (default: 0).
hessian_power
Hessian power (default: 0.5).
Value
An optimizer object implementing the step and zero_grad methods.
Author(s)
Rolf Simoes, rolf.simoes@inpe.br
Felipe Souza, lipecaso@gmail.com
Alber Sanchez, alber.ipia@inpe.br
Gilberto Camara, gilberto.camara@inpe.br
References
Yao, Z., Gholami, A., Shen, S., Mustafa, M., Keutzer, K., & Mahoney, M. (2021). ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10665-10673. https://arxiv.org/abs/2006.00719
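Examples
A minimal sketch of a training loop on a toy objective. It assumes, following the Python reference implementation, that the backward pass is run with create_graph = TRUE so the optimizer can estimate the Hessian diagonal via Hessian-vector products; the objective function and starting point are illustrative choices, not part of the package documentation.

library(torch)
library(torchopt)

# parameter to optimize, starting at (-1, 1)
x <- torch_tensor(c(-1, 1), requires_grad = TRUE)

# Rosenbrock function as a toy objective (minimum at c(1, 1))
rosenbrock <- function(x) {
  (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
}

# create the optimizer for a list of parameters
opt <- optim_adahessian(params = list(x), lr = 0.15)

for (i in seq_len(200)) {
  opt$zero_grad()
  loss <- rosenbrock(x)
  # create_graph = TRUE retains the graph so second-order
  # information can be computed (assumption based on the
  # Python reference implementation)
  loss$backward(create_graph = TRUE)
  opt$step()
}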