bob.ip.binseg.engine.adabound

Implementation of the AdaBound optimizer (reference implementation: https://github.com/Luolc/AdaBound/blob/master/adabound/adabound.py), as described in:

@inproceedings{Luo2019AdaBound,
  author = {Luo, Liangchen and Xiong, Yuanhao and Liu, Yan and Sun, Xu},
  title = {Adaptive Gradient Methods with Dynamic Bound of Learning Rate},
  booktitle = {Proceedings of the 7th International Conference on Learning Representations},
  month = {May},
  year = {2019},
  address = {New Orleans, Louisiana}
}

Classes

AdaBound(params[, lr, betas, final_lr, …])
    Implements the AdaBound algorithm.

AdaBoundW(params[, lr, betas, final_lr, …])
    Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).

class bob.ip.binseg.engine.adabound.AdaBound(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements the AdaBound algorithm.

Parameters
  • params (list) – Iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – Adam learning rate

  • betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square

  • final_lr (float, optional) – Final (SGD) learning rate

  • gamma (float, optional) – Convergence speed of the bound functions (see the bound sketch after this parameter list)

  • eps (float, optional) – Term added to the denominator to improve numerical stability

  • weight_decay (float, optional) – Weight decay (L2 penalty)

  • amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm
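
For orientation on how final_lr and gamma interact: in the reference implementation linked above, the per-parameter Adam step size is clipped into dynamic bounds at step t. This is a hedged sketch of those bounds (the exact scaling inside this module may differ slightly):

    \eta_l(t) = \text{final\_lr}\,\Bigl(1 - \frac{1}{\gamma t + 1}\Bigr),
    \qquad
    \eta_u(t) = \text{final\_lr}\,\Bigl(1 + \frac{1}{\gamma t}\Bigr)

Both bounds converge to final_lr as t grows, which is how the optimizer transitions from Adam-like behaviour early in training to SGD-like behaviour later on.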

step(closure=None)[source]

Performs a single optimization step.

Parameters
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.
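
A minimal usage sketch. The toy model, random data, and hyperparameter values below are illustrative assumptions, not part of this module; the constructor arguments mirror the parameter list documented above:

    import torch
    from bob.ip.binseg.engine.adabound import AdaBound

    model = torch.nn.Linear(10, 1)   # hypothetical toy model
    loss_fn = torch.nn.MSELoss()

    # Constructor arguments mirror the parameter list above.
    optimizer = AdaBound(
        model.parameters(),
        lr=1e-3,         # initial (Adam-like) learning rate
        final_lr=0.1,    # learning rate both bounds converge to
        gamma=1e-3,      # convergence speed of the bound functions
        weight_decay=0,  # L2 penalty
        amsbound=False,  # set True for the AMSBound variant
    )

    x = torch.randn(32, 10)          # illustrative random data
    y = torch.randn(32, 1)

    for _ in range(5):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()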

class bob.ip.binseg.engine.adabound.AdaBoundW(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).

Parameters
  • params (list) – Iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – Adam learning rate

  • betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square

  • final_lr (float, optional) – Final (SGD) learning rate

  • gamma (float, optional) – Convergence speed of the bound functions

  • eps (float, optional) – Term added to the denominator to improve numerical stability

  • weight_decay (float, optional) – Weight decay (L2 penalty)

  • amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm

step(closure=None)[source]

Performs a single optimization step.

Parameters
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.
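
A minimal sketch of calling step() with a closure, reusing the same kind of illustrative toy model and data as above. With AdaBoundW, the configured weight decay is applied in decoupled form rather than as an L2 term added to the gradient:

    import torch
    from bob.ip.binseg.engine.adabound import AdaBoundW

    model = torch.nn.Linear(10, 1)   # hypothetical toy model
    loss_fn = torch.nn.MSELoss()
    optimizer = AdaBoundW(
        model.parameters(), lr=1e-3, final_lr=0.1, weight_decay=1e-4
    )

    x = torch.randn(32, 10)          # illustrative random data
    y = torch.randn(32, 1)

    def closure():
        # Re-evaluates the model and returns the loss, as step() expects.
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        return loss

    loss = optimizer.step(closure)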