Hi all,
My colleague and I tried combining a (relatively) large Ranger learning rate (say, 0.001) with a large weight decay (say, 0.1). The large decay seems to lead to better performance: we tried two different models and observed a 0.5-1.5% increase in ImageNet classification accuracy. Both were customized models, though, not standard ones like ResNet.
I'm curious whether anyone else has seen similar results.
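For context on why the two hyperparameters interact, here is a minimal sketch (an assumption on my part, not taken from Ranger's source) of a decoupled, AdamW-style weight-decay step in pure Python, isolated from the adaptive-gradient part. With lr=0.001 and weight_decay=0.1, each step multiplies every weight by (1 - lr * weight_decay) = 0.9999, so the effective shrinkage depends on the product of the two values, not on the decay alone:

```python
# Sketch of a decoupled (AdamW-style) weight-decay step, isolated from the
# gradient update. Assumes decoupled decay: w <- w * (1 - lr * weight_decay).
# This is an illustration of the decay term only, not Ranger's full update.

def apply_decoupled_decay(weights, lr=0.001, weight_decay=0.1):
    """Return weights after one decoupled weight-decay step."""
    factor = 1.0 - lr * weight_decay  # 0.9999 for lr=0.001, wd=0.1
    return [w * factor for w in weights]

w = [1.0, -2.0]
for _ in range(100):
    w = apply_decoupled_decay(w)
print(w)  # each weight shrunk by roughly 1% after 100 steps
```

So at lr=0.001 even a "large" decay of 0.1 only shrinks weights by about 1% per 100 steps, which may explain why such a big value is tolerable here.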