
Commit 193aee5: useful diagram

1 parent 0bf3bf3

File tree

4 files changed: +7 / -5 lines


README.md

Lines changed: 3 additions & 1 deletion
@@ -1,8 +1,10 @@
+<img src="./adam-atan2.png" width="400px"></img>
+
 ## Adam-atan2 - Pytorch
 
 Implementation of the proposed <a href="https://arxiv.org/abs/2407.05872">Adam-atan2</a> optimizer in Pytorch
 
-A multi-million dollar paper out of google deepmind basically proposes a small change to Adam update rule (using `atan2`) for greater stability
+A multi-million dollar paper out of Google DeepMind proposes a small change to the Adam update rule (using `atan2`), removing the epsilon altogether for numerical stability and scale invariance
 
 ## Install
 
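For intuition, here is a minimal sketch contrasting the classic Adam step with the atan2 variant the README describes. The names (`m_hat`, `v_hat`) and the `eps` default are illustrative placeholders, not the repository's API; the actual update line appears in the `adam_atan2.py` diff further down.

```python
import torch

def adam_update(m_hat, v_hat, eps = 1e-8):
    # classic Adam: divide the bias-corrected first moment by the sqrt of the
    # bias-corrected second moment, with epsilon guarding the denominator
    return m_hat / (v_hat.sqrt() + eps)

def adam_atan2_update(m_hat, v_hat, a = 1., b = 1.):
    # proposed variant: atan2 replaces the division, so no epsilon is needed
    # and the output is bounded in (-pi, pi]
    return a * torch.atan2(m_hat, b * v_hat.sqrt())
```

Since `atan2(c * y, c * x) = atan2(y, x)` for any `c > 0`, rescaling the gradients rescales both arguments equally and leaves the update unchanged, which is the scale invariance the README refers to.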

adam-atan2.png

453 KB

adam_atan2_pytorch/adam_atan2.py

Lines changed: 3 additions & 3 deletions
@@ -83,11 +83,11 @@ def step(
         exp_avg_sq.lerp_(grad * grad, 1. - beta2)
 
         # the following line is the proposed change to the update rule
-        # using atan2 instead of a division with epsilons - they also suggest hyperparameters `a` and `b` should be explored beyond its default of 1.
+        # using atan2 instead of a division with epsilon in denominator
 
-        update = a * atan2(exp_avg / bias_correct1, b * sqrt(exp_avg_sq / bias_correct2))
+        update = atan2(exp_avg / bias_correct1, b * sqrt(exp_avg_sq / bias_correct2))
 
-        p.add_(update, alpha = -lr)
+        p.add_(update, alpha = -lr * a)
 
         # increment steps
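Taken together, the last two hunks are an algebraic no-op: scaling the `atan2` output by `a` and then stepping with `alpha = -lr` equals stepping with `alpha = -lr * a`, which folds the scalar into `add_` and drops one elementwise tensor multiply. A quick self-contained check (tensor values are arbitrary):

```python
import torch

m_hat, v_hat = torch.randn(3), torch.rand(3)
a, b, lr = 2., 1., 1e-3

update = torch.atan2(m_hat, b * v_hat.sqrt())

p_old = torch.zeros(3).add_(a * update, alpha = -lr)   # before this commit
p_new = torch.zeros(3).add_(update, alpha = -lr * a)   # after this commit
assert torch.allclose(p_old, p_new)
```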

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "adam-atan2-pytorch"
-version = "0.0.2"
+version = "0.0.3"
 description = "Adam-atan2 for Pytorch"
 authors = [
     { name = "Phil Wang", email = "lucidrains@gmail.com" }
