Is adabelief the best optimizer? #44

Open
LifeIsStrange opened this issue Oct 23, 2020 · 7 comments

Comments

@LifeIsStrange

https://paperswithcode.com/paper/adabelief-optimizer-adapting-stepsizes-by-the

@LifeIsStrange
Author

"This work considers the update step in first-order methods. Other directions include Lookahead [42] which updates “fast” and “slow” weights separately, and is a wrapper that can combine with other optimizers; variance reduction methods [43, 44, 45] which reduce the variance in gradient; and LARS [46] which uses a layer-wise learning rate scaling. AdaBelief can be combined with these methods. Other variants of Adam have been proposed (e.g. NosAdam [47], Sadam [48] and Adax [49])."
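The "fast" and "slow" weights that the quoted passage attributes to Lookahead can be illustrated with a toy sketch. This is not the paper's code: the inner optimizer here is plain gradient descent rather than AdaBelief, the function name `lookahead_minimize` is hypothetical, and the values of `inner_lr`, `k`, and `alpha` are illustrative assumptions.

```python
def lookahead_minimize(grad_fn, x0, inner_lr=0.1, k=5, alpha=0.5, outer_steps=20):
    """Toy Lookahead wrapper around plain gradient descent (an assumption;
    any inner optimizer could take its place)."""
    slow = x0
    for _ in range(outer_steps):
        # The fast weights start each round from the current slow weights.
        fast = slow
        for _ in range(k):
            # Inner optimizer step on the fast weights.
            fast = fast - inner_lr * grad_fn(fast)
        # The slow weights move a fraction alpha toward the fast weights.
        slow = slow + alpha * (fast - slow)
    return slow


# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_final = lookahead_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

Because the wrapper only touches the weights before and after the inner steps, the inner update rule can be swapped without changing the outer loop, which is the sense in which it "can combine with other optimizers".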

@hiyyg

hiyyg commented Dec 28, 2020

I tested AdaBelief on my task; it performed worse than Ranger.

@juntang-zhuang

@hiyyg Could you post your task, network, and the hyperparameters of the two optimizers for your task?

@hiyyg

hiyyg commented Aug 8, 2021

It was an internal task, so sorry, I cannot share it. The hyperparameters were all the defaults for both optimizers.

@juntang-zhuang

juntang-zhuang commented Aug 8, 2021

@hiyyg Which version of AdaBelief did you use? I'm not sure whether this is caused by eps. From quickly skimming the Ranger code, its default is eps=1e-5, which is equivalent to eps=1e-10 for AdaBelief. The most recent release (0.2) defaults to eps=1e-16 for AdaBelief, which is equivalent to eps=1e-8 for Adam. The choice of eps is crucial for adaptive optimizers, so this could be the cause of the performance difference.
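The eps equivalence mentioned above can be sketched numerically. The sketch assumes (this is not spelled out in the thread) that Adam-style optimizers add eps outside the square root, as in `sqrt(v) + eps`, while AdaBelief adds eps to its variance estimate inside the square root, as in `sqrt(s + eps)`, so an Adam-style eps corresponds roughly to its square on the AdaBelief side. All function names are hypothetical, and the denominators are simplified.

```python
import math


def adam_style_denominator(v, eps):
    # Assumption: eps is added outside the square root.
    return math.sqrt(v) + eps


def adabelief_style_denominator(s, eps):
    # Assumption: eps is added inside the square root. This is simplified;
    # a real implementation may add eps in more than one place.
    return math.sqrt(s + eps)


def adam_eps_to_adabelief_eps(adam_eps):
    # Hypothetical helper: squaring makes the two denominators match
    # when the second-moment estimate is close to zero.
    return adam_eps ** 2


ranger_default_eps = 1e-5  # value quoted in the comment above
matched_eps = adam_eps_to_adabelief_eps(ranger_default_eps)

# Smallest possible denominators (second-moment estimate equal to zero).
floor_adam = adam_style_denominator(0.0, ranger_default_eps)
floor_belief = adabelief_style_denominator(0.0, matched_eps)
```

Under these assumptions, a fair comparison would pass AdaBelief roughly the square of the eps used by the Adam-style optimizer instead of relying on each optimizer's default.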

@hiyyg

hiyyg commented Aug 8, 2021

Thanks. I guess I used the version from around 28 Dec 2020. I think your information might be very useful for users who want to compare AdaBelief with Ranger.

@juntang-zhuang

juntang-zhuang commented Aug 8, 2021

Thanks for the info. 28 Dec 2020 corresponds to roughly v0.1, where the default is eps=1e-16 for AdaBelief.


3 participants