bob.ip.binseg.engine.adabound
Implementation of the AdaBound optimizer (https://github.com/Luolc/AdaBound/blob/master/adabound/adabound.py):
@inproceedings{Luo2019AdaBound,
  author    = {Luo, Liangchen and Xiong, Yuanhao and Liu, Yan and Sun, Xu},
  title     = {Adaptive Gradient Methods with Dynamic Bound of Learning Rate},
  booktitle = {Proceedings of the 7th International Conference on Learning Representations},
  month     = {May},
  year      = {2019},
  address   = {New Orleans, Louisiana}
}
Classes

AdaBound    Implements the AdaBound algorithm.
AdaBoundW   Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).
class bob.ip.binseg.engine.adabound.AdaBound(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)

Bases: torch.optim.optimizer.Optimizer
Implements the AdaBound algorithm.
Parameters

params (list) – Iterable of parameters to optimize, or dicts defining parameter groups
lr (float, optional) – Adam learning rate
betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square
final_lr (float, optional) – Final (SGD) learning rate
gamma (float, optional) – Convergence speed of the bound functions
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 penalty)
amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm
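A minimal usage sketch, assuming the documented constructor signature above; the linear model, loss, and data are illustrative placeholders and not part of this module:

    import torch
    from bob.ip.binseg.engine.adabound import AdaBound

    model = torch.nn.Linear(10, 1)       # placeholder model
    criterion = torch.nn.MSELoss()       # placeholder loss

    optimizer = AdaBound(
        model.parameters(),
        lr=1e-3,        # initial (Adam-like) learning rate
        final_lr=0.1,   # learning rate the dynamic bounds converge to (SGD-like)
        gamma=1e-3,     # convergence speed of the bound functions
        amsbound=False,
    )

    # one illustrative optimization step
    x, y = torch.randn(4, 10), torch.randn(4, 1)
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

Since AdaBound derives from torch.optim.optimizer.Optimizer, it is driven with the usual zero_grad()/step() calls inside any standard training loop.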
class bob.ip.binseg.engine.adabound.AdaBoundW(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)

Bases: torch.optim.optimizer.Optimizer
Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).
Parameters

params (list) – Iterable of parameters to optimize, or dicts defining parameter groups
lr (float, optional) – Adam learning rate
betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square
final_lr (float, optional) – Final (SGD) learning rate
gamma (float, optional) – Convergence speed of the bound functions
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 penalty)
amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm
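A short sketch of the decoupled variant, again with an illustrative placeholder model; the hyper-parameter values are assumptions, not recommendations. Per the decoupled scheme of arXiv:1711.05101, the decay acts on the weights themselves rather than through the gradient:

    import torch
    from bob.ip.binseg.engine.adabound import AdaBoundW

    model = torch.nn.Linear(10, 1)       # placeholder model

    optimizer = AdaBoundW(
        model.parameters(),
        lr=1e-3,
        final_lr=0.1,
        weight_decay=1e-2,   # decoupled decay: shrinks the weights directly
                             # instead of adding an L2 term to the gradient
    )

    # dummy gradients, only to show that step() is called exactly as for AdaBound
    for p in model.parameters():
        p.grad = torch.zeros_like(p)
    optimizer.step()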