About NoamLR #7

Open
ShaneTian opened this issue Mar 6, 2022 · 0 comments

Comments

@ShaneTian

def get_lr(self):
last_epoch = max(1, self.last_epoch)
scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5))
return [base_lr * scale for base_lr in self.base_lrs]

The custom NoamLR produces the same LR for the first two steps, for example:

warmup_steps = 10
lr = 0.01
| step | before (LR at this step) | after (LR after the next scheduler step) |
| ---- | ------------------------ | ---------------------------------------- |
| 0    | 0.001                    | 0.001                                     |
| 1    | 0.001                    | 0.002                                     |
| 2    | 0.002                    | 0.003                                     |

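To see where those numbers come from, here is a minimal, self-contained sketch that reproduces them. Only get_lr is quoted from this repo; the wrapper class, the dummy model, and the optimizer setup are assumptions for illustration:

import torch
from torch.optim.lr_scheduler import _LRScheduler

class NoamLR(_LRScheduler):
    # Hypothetical minimal wrapper around the get_lr quoted above.
    def __init__(self, optimizer, warmup_steps, last_epoch=-1):
        self.warmup_steps = warmup_steps
        super().__init__(optimizer, last_epoch)

    def get_lr(self):
        last_epoch = max(1, self.last_epoch)
        scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5))
        return [base_lr * scale for base_lr in self.base_lrs]

optimizer = torch.optim.Adam(torch.nn.Linear(4, 4).parameters(), lr=0.01)
scheduler = NoamLR(optimizer, warmup_steps=10)
for step in range(4):
    # steps 0 and 1 both print ~0.001, then ~0.002, ~0.003
    print(step, scheduler.get_last_lr())
    optimizer.step()
    scheduler.step()
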
There are two ways to fix this:

  • last_epoch = self.last_epoch + 1, like ESPnet
    def get_lr(self): 
        last_epoch = self.last_epoch + 1
        scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5)) 
        return [base_lr * scale for base_lr in self.base_lrs] 
  • use LambdaLR directly
    noam_scale = lambda epoch: (warmup_steps ** 0.5) * min((epoch + 1) ** -0.5, (epoch + 1) * (warmup_steps ** -1.5))
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_scale)

Of course, the above two approaches are equivalent.
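A quick check of that claim (warmup_steps is just the example value from above):

warmup_steps = 10

def espnet_scale(epoch):
    # fix 1: shift last_epoch by one inside get_lr
    last_epoch = epoch + 1
    return warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * warmup_steps ** (-1.5))

# fix 2: the factor that LambdaLR multiplies into the base LR
noam_scale = lambda epoch: (warmup_steps ** 0.5) * min((epoch + 1) ** -0.5, (epoch + 1) * (warmup_steps ** -1.5))

# both produce identical scale factors at every step
assert all(abs(espnet_scale(e) - noam_scale(e)) < 1e-12 for e in range(100))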
