
larger learning rate + large weight decay performs better? #18

Open
askerlee opened this issue Oct 28, 2019 · 0 comments

Hi all,
My colleague and I tried a combination of a (relatively) large Ranger learning rate (say, 0.001) and a large weight decay (say, 0.1). The large decay seems to lead to better performance. We tried two different models and observed a 0.5-1.5% increase in ImageNet classification accuracy, but both were customized models rather than standard ones like ResNet.
Not sure whether anyone else has found similar results.
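For reference, a minimal sketch of the kind of setup being described, assuming the `ranger` module from this repo is importable; the stand-in model and dummy batch below are placeholders, not the customized models from the experiment:

```python
import torch
import torch.nn as nn
from ranger import Ranger  # Ranger-Deep-Learning-Optimizer; import path depends on install

# Placeholder model (the actual models in the experiment were custom).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1000))

# The combination under discussion: large learning rate + large weight decay.
optimizer = Ranger(model.parameters(), lr=0.001, weight_decay=0.1)

criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 3, 224, 224)      # dummy ImageNet-sized batch
targets = torch.randint(0, 1000, (8,))    # dummy labels

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()
```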
