optim_adahessian {torchopt}          R Documentation

Adahessian optimizer

Description

R implementation of the Adahessian optimizer proposed by Yao et al. (2021). The original implementation is available at https://github.com/amirgholami/adahessian.

Usage

optim_adahessian(
  params,
  lr = 0.15,
  betas = c(0.9, 0.999),
  eps = 1e-04,
  weight_decay = 0,
  hessian_power = 0.5
)

Arguments

params

Iterable of parameters to optimize.

lr

Learning rate (default: 0.15).

betas

Coefficients used for computing running averages of the gradient and its square (default: c(0.9, 0.999)).

eps

Term added to the denominator to improve numerical stability (default: 1e-4).

weight_decay

L2 penalty (default: 0).

hessian_power

Hessian power; the exponent applied to the Hessian-based second-moment term in the update denominator (default: 0.5).

Value

An optimizer object implementing the step and zero_grad methods.

Author(s)

Rolf Simoes, rolf.simoes@inpe.br

Felipe Souza, lipecaso@gmail.com

Alber Sanchez, alber.ipia@inpe.br

Gilberto Camara, gilberto.camara@inpe.br

References

Yao, Z., Gholami, A., Shen, S., Mustafa, M., Keutzer, K., & Mahoney, M. (2021). ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10665-10673. https://arxiv.org/abs/2006.00719

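Examples

A minimal, hypothetical training-loop sketch is shown below. It assumes the torch package is loaded alongside torchopt and uses made-up toy data; it also assumes the backward pass is run with create_graph = TRUE so that the second-order information Adahessian relies on is available. These details are illustrative assumptions, not part of the package documentation.

library(torch)
library(torchopt)

# Hypothetical toy regression problem.
torch_manual_seed(42)
x <- torch_randn(100, 3)
y <- x$matmul(torch_tensor(c(1.5, -2.0, 0.5))) + 0.1 * torch_randn(100)

model <- nn_linear(3, 1)
opt <- optim_adahessian(model$parameters, lr = 0.15)

for (epoch in 1:50) {
  opt$zero_grad()
  pred <- model(x)$squeeze(2)
  loss <- nnf_mse_loss(pred, y)
  # Assumption: Adahessian needs the autograd graph to estimate the
  # Hessian diagonal, so the backward call creates the graph.
  loss$backward(create_graph = TRUE)
  opt$step()
}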
