For training DensePose with 1 GPU, how was the linear learning rate scaling rule applied? #2731
Unanswered

abhaydoke09 asked this question in Q&A
Replies: 1 comment
- BASE_LR in https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/configs/Base-DensePose-RCNN-FPN.yaml is set to 0.01 for 8 GPUs with IMS_PER_BATCH set to 16. When training with 1 GPU and IMS_PER_BATCH = 2, applying the linear learning rate scaling rule gives 0.01 / 8 = 0.00125, not the 0.0025 given in the readme.
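A minimal sketch of the arithmetic behind the linear scaling rule described in the comment above (the helper `scale_lr` is illustrative, not part of detectron2):

```python
# Linear scaling rule: scale the reference learning rate by the ratio of total batch sizes.
def scale_lr(ref_lr: float, ref_batch: int, new_batch: int) -> float:
    return ref_lr * new_batch / ref_batch

# Values from Base-DensePose-RCNN-FPN.yaml: BASE_LR = 0.01 at IMS_PER_BATCH = 16 (8 GPUs).
print(scale_lr(0.01, 16, 2))  # 0.00125 -- strict linear scaling for IMS_PER_BATCH = 2
# GETTING_STARTED.md lists 0.0025 instead, which is the discrepancy raised in this comment.
```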
📚 Documentation

As per the instructions provided here: https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/doc/GETTING_STARTED.md, training with 1 GPU uses a batch size of 2 and a base learning rate of 0.0025. How was the linear learning rate scaling rule applied here? How did we end up with 0.0025 for a batch size of 2?
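For context, a hedged sketch of how those 1-GPU values could be set on the config referenced in this thread via detectron2's config API; the config path and the `add_densepose_config` import reflect the DensePose project layout and are assumptions here, not the exact command from GETTING_STARTED.md:

```python
# Illustrative only: build a DensePose config and apply the 1-GPU overrides
# (batch size 2, BASE_LR 0.0025) discussed in this thread.
from detectron2.config import get_cfg
from densepose import add_densepose_config  # assumed import from projects/DensePose

cfg = get_cfg()
add_densepose_config(cfg)  # register DensePose-specific config keys
cfg.merge_from_file("projects/DensePose/configs/Base-DensePose-RCNN-FPN.yaml")  # path assumed
cfg.SOLVER.IMS_PER_BATCH = 2   # 1 GPU, 2 images per batch
cfg.SOLVER.BASE_LR = 0.0025    # value given in GETTING_STARTED.md
```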