Does it make sense to use it on a batch of 1? #28

bratao · 2020-04-12T17:57:12Z

@lessw2020 Thanks for this awesome optimizer. I'm very excited about it!

I have one particular workload that trains with a batch of 1 item.
Theoretically, does it make sense to use RAdam (Rectified Adam), LookAhead, and GC in this context?

I have been thinking about it and have read the papers, but I still could not reach a conclusion. As you (or anyone else here) are much more experienced than me, do you have an opinion on this?

lessw2020 · 2020-04-12T20:17:50Z

Hi @bratao - it would still make sense to use it, but my recommendation is to run it with MABN (moving average batch norm).
MABN keeps a moving average of statistics across batches. The authors show, for example, that a batch size of 2 can reach the same accuracy as a batch size of 32, where normally there is a large drop.
I am planning to test it this week, so I don't have proof that it works yet, but the paper looks strong and the idea is solid.
https://arxiv.org/abs/2001.06838
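
To illustrate the general idea of averaging batch statistics across iterations, here is a minimal sketch in plain Python. It is not the MABN algorithm from the paper and is not taken from their code. The class name `MovingAverageStats` and the `momentum` and `eps` parameters are hypothetical and chosen only for illustration. It normalizes each small batch with exponentially averaged mean and variance instead of the noisy per-batch values.

```python
import math


class MovingAverageStats:
    """Hypothetical helper: exponential moving average of batch mean/variance."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum  # weight given to the running estimate
        self.eps = eps            # avoids division by zero when normalizing
        self.mean = None
        self.var = None

    def update(self, batch):
        # Statistics of the current (small) batch.
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        if self.mean is None:
            # First batch: initialize the running estimates directly.
            self.mean, self.var = batch_mean, batch_var
        else:
            m = self.momentum
            self.mean = m * self.mean + (1 - m) * batch_mean
            self.var = m * self.var + (1 - m) * batch_var

    def normalize(self, batch):
        # Normalize with the running statistics, not the per-batch ones.
        std = math.sqrt(self.var + self.eps)
        return [(x - self.mean) / std for x in batch]


stats = MovingAverageStats(momentum=0.9)
for batch in [[1.0, 3.0], [2.0, 6.0], [0.0, 4.0]]:  # batches of size 2
    stats.update(batch)
    normalized = stats.normalize(batch)
```

Because the running estimates change slowly, a single unusual batch shifts the normalization much less than it would with per-batch statistics, which is the intuition behind using moving averages at very small batch sizes.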

lessw2020 · 2020-04-12T20:19:32Z

Their code is linked there, though as I recall it likely needs to be extracted out of their framework.
Anyway, it's on my todo list, and maybe I can pull it out and make it a pluggable item.
Regardless, that is the best way imo to address the batch size 1 issue.
Hope that helps!
I'll leave this open to track my testing results on MABN. Please post if you use it before I get to it :)

bratao · 2020-04-24T17:54:28Z

@lessw2020 I know that I'm just a beggar, but the first thing I do every morning is open this issue to check if you got to MABN.

Good vibes from an anxious fan ☮️
